Nonparametric Network Models for Link Prediction

Many data sets can be represented as a sequence of interactions between entities: for example, communications between individuals in a social network, protein-protein or DNA-protein interactions in a biological context, or vehicles' journeys between cities. In these contexts, there is often interest in making predictions about future interactions, such as who will message whom. A popular approach to network modeling in a Bayesian context is to assume that the observed interactions can be explained in terms of some latent structure. For example, traffic patterns might be explained by the size and importance of cities, and social network interactions might be explained by the social groups and interests of individuals. Unfortunately, while elucidating this structure can be useful, it often does not translate directly into an effective predictive tool. Further, many existing approaches are not appropriate for sparse networks, a class that includes many interesting real-world situations. In this paper, we develop models for sparse networks that combine structure elucidation with predictive performance. We use a Bayesian nonparametric approach, which allows us to predict interactions with entities outside our training set, and allows both the latent dimensionality of the model and the number of nodes in the network to grow in expectation as we see more data. We demonstrate that we can capture latent structure while maintaining predictive power, and discuss possible extensions.
