Generating Realistic Labelled, Weighted Random Graphs

Generative algorithms for random graphs have yielded insights into the structure and evolution of real-world networks. Many such networks share a well-known set of properties, such as heavy-tailed degree distributions, clustering and community formation. Random graph models usually consider only structural information, but many real-world networks also have labelled vertices and weighted edges. In this paper, we present a generative model for random graphs with discrete vertex labels and numeric edge weights. The weights are represented as a set of Beta Mixture Models (BMMs) with an arbitrary number of mixture components, which are learned from real-world networks. We propose a Bayesian Variational Inference (VI) approach, which yields accurate estimates while keeping computation times tractable. We compare our approach to state-of-the-art random labelled graph generators and to an earlier approach based on Gaussian Mixture Models (GMMs). Our results allow us to draw conclusions about the contribution of vertex labels and edge weights to graph structure.
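
To make the idea of BMM-distributed edge weights concrete, the following is a minimal sketch, not the model described in the paper. It assumes that each ordered pair of vertex labels has its own Beta mixture and its own edge probability; the function names (`sample_bmm`, `generate_graph`), the `edge_prob` table and all parameter values are hypothetical and chosen only for illustration. In the paper the mixture parameters are learned from real-world networks by variational inference, which is not shown here.

```python
import numpy as np


def sample_bmm(rng, weights, alphas, betas, size):
    """Draw `size` values in [0, 1] from a Beta mixture (hypothetical helper)."""
    weights = np.asarray(weights, dtype=float)
    alphas = np.asarray(alphas, dtype=float)
    betas = np.asarray(betas, dtype=float)
    # Choose a mixture component per sample, then draw from that component's Beta.
    comps = rng.choice(len(weights), size=size, p=weights)
    return rng.beta(alphas[comps], betas[comps])


def generate_graph(rng, labels, edge_prob, bmm_params):
    """Return directed edges (u, v, weight) for vertices with the given labels.

    Assumption for this sketch: an edge u -> v exists with probability
    edge_prob[(label_u, label_v)], and its weight is drawn from the Beta
    mixture stored under the same label pair in bmm_params.
    """
    edges = []
    n = len(labels)
    for u in range(n):
        for v in range(n):
            if u == v:
                continue  # no self-loops in this sketch
            pair = (labels[u], labels[v])
            if rng.random() < edge_prob[pair]:
                w = sample_bmm(rng, *bmm_params[pair], size=1)[0]
                edges.append((u, v, float(w)))
    return edges


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    labels = ["a", "b", "a", "b", "b"]
    pairs = [(x, y) for x in "ab" for y in "ab"]
    # Illustrative values only: same-label pairs connect more often.
    edge_prob = {p: (0.6 if p[0] == p[1] else 0.2) for p in pairs}
    # Each entry is (mixing weights, alphas, betas) for a two-component mixture.
    bmm_params = {p: ([0.3, 0.7], [2.0, 8.0], [8.0, 2.0]) for p in pairs}
    for u, v, w in generate_graph(rng, labels, edge_prob, bmm_params):
        print(f"{u} -> {v}  weight={w:.3f}")
```

A Beta mixture suits weights that are bounded, since every component has support on [0, 1]; weights on another range would need rescaling into that interval first, which this sketch leaves out.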
