Mixture models and exploratory analysis in networks

Networks are widely used in the biological, physical, and social sciences as a concise mathematical representation of the topology of systems of interacting components. Understanding the structure of these networks is one of the outstanding challenges in the study of complex systems. Here we describe a general technique for detecting structural features in large-scale network data that works by dividing the nodes of a network into classes such that the members of each class have similar patterns of connection to other nodes. Using the machinery of probabilistic mixture models and the expectation–maximization algorithm, we show that it is possible to detect, without prior knowledge of what we are looking for, a very broad range of types of structure in networks. We give a number of examples demonstrating how the method can be used to shed light on the properties of real-world networks, including social and information networks.
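As a rough illustration of the kind of procedure described above, the following is a minimal sketch of an expectation–maximization loop for one possible mixture model on a network. The abstract does not specify the model, so everything here is an assumption made for illustration: each node i is taken to belong to one of a fixed number of classes with prior probability `pi[r]`, and each link from a node in class r is taken to land on node j with probability `theta[r, j]`. The function name `em_node_mixture`, its parameters, and the smoothing constant `eps` are hypothetical and not taken from the paper.

```python
import numpy as np


def em_node_mixture(A, n_classes, n_iter=200, seed=0, eps=1e-12):
    """Soft-assign nodes to classes by their connection patterns (sketch).

    A         : (n, n) adjacency matrix; A[i, j] = 1 if i links to j.
    n_classes : number of classes, assumed known in advance.
    Returns (q, pi, theta), where q[i, r] is the estimated probability
    that node i belongs to class r.
    """
    rng = np.random.default_rng(seed)
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    degree = A.sum(axis=1)  # number of links leaving each node

    # Random soft initialization; each row of q sums to 1.
    q = rng.dirichlet(np.ones(n_classes), size=n)

    for _ in range(n_iter):
        # M-step: re-estimate class priors and per-class link targets.
        pi = q.mean(axis=0)
        theta = (q.T @ A) / np.maximum(q.T @ degree, eps)[:, None]

        # E-step: posterior class memberships, computed in log space.
        log_q = np.log(pi + eps)[None, :] + A @ np.log(theta + eps).T
        log_q -= log_q.max(axis=1, keepdims=True)  # numerical stability
        q = np.exp(log_q)
        q /= q.sum(axis=1, keepdims=True)

    return q, pi, theta


# Toy input (an assumption for illustration): two disconnected 5-node cliques.
A = np.zeros((10, 10))
A[:5, :5] = 1
A[5:, 5:] = 1
np.fill_diagonal(A, 0)

q, pi, theta = em_node_mixture(A, n_classes=2)
labels = q.argmax(axis=1)  # hard class assignment per node
```

Because classes are defined only by where their members' links point, a sketch like this is not restricted to densely connected groups; in principle it can also pick out classes whose members link mostly to nodes outside their own class. The number of classes is fixed by hand here, and EM may settle on a local optimum, so several random restarts would be a sensible precaution.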
