Random function priors for exchangeable arrays with applications to graphs and relational data

A fundamental problem in the analysis of structured relational data such as graphs, networks, databases, and matrices is to extract a summary of the common structure underlying the relations between individual entities. Relational data are typically encoded as arrays, and requiring a model to be invariant to the ordering of rows and columns amounts to modelling the data as an exchangeable array. Results in probability theory due to Aldous, Hoover and Kallenberg show that exchangeable arrays can be represented in terms of a random measurable function, which constitutes the natural model parameter in a Bayesian model. We obtain a flexible yet simple Bayesian nonparametric model by placing a Gaussian process prior on this parameter function. Efficient inference utilises elliptical slice sampling combined with a random sparse approximation to the Gaussian process. We demonstrate applications of the model to network data and clarify its relation to models in the literature, several of which emerge as special cases.
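To make the generative idea concrete, the following is a minimal sketch of drawing a small binary exchangeable array from a Gaussian process prior on the parameter function. It is an illustration under stated assumptions, not the paper's implementation: the squared-exponential kernel, the logistic link, the `length_scale` parameter, and the function name `sample_exchangeable_graph` are all hypothetical choices made here, and the sketch uses a dense Cholesky factorisation rather than the random sparse approximation mentioned above. It covers only sampling from the prior, not inference by elliptical slice sampling.

```python
import numpy as np


def sample_exchangeable_graph(n, length_scale=0.3, seed=0):
    """Draw an n x n binary array from a GP-based random function prior.

    Assumptions (not taken from the paper): squared-exponential kernel on
    [0, 1]^2, logistic link, dense Cholesky sampling.
    """
    rng = np.random.default_rng(seed)

    # One latent uniform variable per row/column entity.
    u = rng.uniform(size=n)

    # Every ordered pair (u_i, u_j) is a point in [0, 1]^2.
    ii, jj = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    pairs = np.stack([u[ii.ravel()], u[jj.ravel()]], axis=1)  # shape (n*n, 2)

    # Squared-exponential kernel matrix over all pairs, with a small jitter
    # on the diagonal for numerical stability.
    sq_dist = ((pairs[:, None, :] - pairs[None, :, :]) ** 2).sum(axis=-1)
    K = np.exp(-0.5 * sq_dist / length_scale**2) + 1e-6 * np.eye(n * n)

    # GP draw of the random function, evaluated at the n*n pairs.
    theta = np.linalg.cholesky(K) @ rng.standard_normal(n * n)
    theta = theta.reshape(n, n)

    # Logistic link maps function values to edge probabilities.
    probs = 1.0 / (1.0 + np.exp(-theta))

    # Entries are conditionally independent Bernoulli draws.
    X = (rng.uniform(size=(n, n)) < probs).astype(int)
    return u, probs, X


if __name__ == "__main__":
    u, probs, X = sample_exchangeable_graph(8)
    print(X)
```

Because each entry depends on the entities only through the i.i.d. latent variables `u`, permuting rows and columns leaves the distribution of `X` unchanged. The array here is not forced to be symmetric; an undirected graph would additionally need a symmetric function and one draw per unordered pair.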
