A mixture model for random graphs

Abstract The Erdős–Rényi model of a network is simple and admits many explicit expressions for average and asymptotic properties, but it fits real-world networks poorly. The vertices of those networks are often structured in unknown classes (functionally related proteins or social communities) with different connectivity properties. The stochastic block structures model was proposed for this purpose in the context of social sciences, using a Bayesian approach. We consider the same model in a frequentist statistical framework. We give the degree distribution and the clustering coefficient associated with this model, a variational method to estimate its parameters, and a model selection criterion to select the number of classes. This estimation procedure allows us to deal with large networks containing thousands of vertices. The method is used to uncover the modular structure of a network of enzymatic reactions.
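
To make the class-structured model concrete, the following is a minimal simulation sketch and not the paper's estimation procedure. It assumes each vertex is assigned independently to one of Q hidden classes, and that an edge between two vertices appears with a probability depending only on their classes. The names `sample_mixture_graph`, `alpha` (class proportions) and `pi` (class-to-class connection probabilities) are illustrative choices, not taken from the abstract.

```python
import random


def sample_mixture_graph(n, alpha, pi, seed=None):
    """Sample an undirected graph whose vertices belong to hidden classes.

    n     -- number of vertices
    alpha -- list of class proportions (assumed to sum to 1)
    pi    -- symmetric matrix, pi[q][l] = edge probability between
             a vertex of class q and a vertex of class l
    Returns (classes, edges): class label per vertex, list of (i, j) pairs.
    """
    rng = random.Random(seed)
    n_classes = len(alpha)
    # Draw the hidden class of each vertex independently.
    classes = rng.choices(range(n_classes), weights=alpha, k=n)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            # Edge probability depends only on the two class labels.
            if rng.random() < pi[classes[i]][classes[j]]:
                edges.append((i, j))
    return classes, edges


def degrees(n, edges):
    """Degree of each vertex, computed from the edge list."""
    deg = [0] * n
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    return deg


if __name__ == "__main__":
    # Two classes: dense within class, sparse between classes.
    alpha = [0.3, 0.7]
    pi = [[0.8, 0.05],
          [0.05, 0.4]]
    classes, edges = sample_mixture_graph(50, alpha, pi, seed=1)
    print(len(edges), "edges;", "max degree", max(degrees(50, edges)))
```

With a single class the sketch reduces to the Erdős–Rényi model. With several classes the degree of a vertex depends on its class, which is how a model of this kind can produce more heterogeneous degree distributions.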
