Discriminative Nonparametric Latent Feature Relational Models with Data Augmentation

We present a discriminative nonparametric latent feature relational model (LFRM) for link prediction that automatically infers the dimensionality of the latent features. Under the generic RegBayes (regularized Bayesian inference) framework, we conveniently incorporate the prediction loss into the probabilistic inference of a Bayesian model; set distinct regularization parameters for different types of links to handle the imbalance issue in real networks; and unify the analysis of both the smooth logistic log-loss and the piecewise linear hinge loss. For the nonconjugate posterior inference, we present a simple Gibbs sampler via data augmentation, without making the restrictive assumptions required by variational methods. We further develop an approximate sampler using stochastic gradient Langevin dynamics to handle large networks with hundreds of thousands of entities and millions of links, orders of magnitude larger than what existing LFRMs can process. Extensive studies on various real networks show promising performance.
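As a rough illustration of the ideas named in the abstract, the sketch below scores a link with a bilinear form over binary latent features and applies one stochastic gradient Langevin dynamics (SGLD) update to the weight matrix under the logistic log-loss. It is a minimal sketch under stated assumptions, not the paper's sampler: the feature matrix `Z` is held fixed (the nonparametric inference of its dimensionality is omitted), the prior on `W` is assumed to be an isotropic Gaussian, and all names (`link_prob`, `sgld_step`, `c_pos`, `c_neg`, `prior_var`) are hypothetical. The weights `c_pos` and `c_neg` stand in for the distinct regularization parameters for different link types.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def link_prob(Z, W, i, j):
    """P(y_ij = 1) = sigmoid(z_i^T W z_j) for feature rows z_i and z_j."""
    K = len(W)
    s = sum(Z[i][k] * W[k][l] * Z[j][l] for k in range(K) for l in range(K))
    return sigmoid(s)


def sgld_step(W, Z, batch, n_total, step,
              prior_var=1.0, c_pos=1.0, c_neg=1.0, rng=random):
    """One SGLD update of W on a mini-batch, with Z held fixed (assumption).

    batch   : list of (i, j, y) triples with y in {0, 1}
    n_total : number of observed pairs in the full training set
    step    : SGLD step size (epsilon)
    """
    K = len(W)
    scale = n_total / len(batch)  # rescale mini-batch gradient to full data
    # Gradient of the assumed Gaussian log-prior: -W / prior_var.
    grad = [[-W[k][l] / prior_var for l in range(K)] for k in range(K)]
    for i, j, y in batch:
        # Link-type-specific weight to counter class imbalance.
        c = c_pos if y == 1 else c_neg
        resid = c * (y - link_prob(Z, W, i, j))
        for k in range(K):
            for l in range(K):
                grad[k][l] += scale * resid * Z[i][k] * Z[j][l]
    # Langevin update: half-step along the gradient plus N(0, step) noise.
    sd = math.sqrt(step)
    return [[W[k][l] + 0.5 * step * grad[k][l] + rng.gauss(0.0, sd)
             for l in range(K)] for k in range(K)]
```

Because each update touches only a mini-batch of pairs, its cost does not grow with the number of links in the network, which is the property that makes this style of sampler attractive for large graphs.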
