GMNN: Graph Markov Neural Networks

This paper studies semi-supervised object classification in relational data, which is a fundamental problem in relational data modeling. The problem has been extensively studied in the literature of both statistical relational learning (e.g. relational Markov networks) and graph neural networks (e.g. graph convolutional networks). Statistical relational learning methods can effectively model the dependency of object labels through conditional random fields for collective classification, whereas graph neural networks learn effective object representations for classification through end-to-end training. In this paper, we propose the Graph Markov Neural Network (GMNN) that combines the advantages of both worlds. A GMNN models the joint distribution of object labels with a conditional random field, which can be effectively trained with the variational EM algorithm. In the E-step, one graph neural network learns effective object representations for approximating the posterior distributions of object labels. In the M-step, another graph neural network is used to model the local label dependency. Experiments on object classification, link classification, and unsupervised node representation learning show that GMNN achieves state-of-the-art results.

[1]  Alexander M. Rush,et al.  Latent Alignment and Variational Attention , 2018, NeurIPS.

[2]  Nir Friedman,et al.  Probabilistic Graphical Models - Principles and Techniques , 2009 .

[3]  Mingzhe Wang,et al.  LINE: Large-scale Information Network Embedding , 2015, WWW.

[4]  Pasquale Minervini,et al.  Convolutional 2D Knowledge Graph Embeddings , 2017, AAAI.

[5]  Ben Taskar,et al.  Link Prediction in Relational Data , 2003, NIPS.

[6]  Ben Taskar,et al.  Discriminative Probabilistic Models for Relational Data , 2002, UAI.

[7]  Michael I. Jordan,et al.  Loopy Belief Propagation for Approximate Inference: An Empirical Study , 1999, UAI.

[8]  Matthew Richardson,et al.  Markov logic networks , 2006, Machine Learning.

[9]  Praveen Paritosh,et al.  Freebase: a collaboratively created graph database for structuring human knowledge , 2008, SIGMOD Conference.

[10]  Le Song,et al.  Discriminative Embeddings of Latent Variable Models for Structured Data , 2016, ICML.

[11]  Ben Taskar,et al.  Relational Markov Networks , 2007 .

[12]  Jure Leskovec,et al.  Inductive Representation Learning on Large Graphs , 2017, NIPS.

[13]  Christos Faloutsos,et al.  Edge Weight Prediction in Weighted Signed Networks , 2016, 2016 IEEE 16th International Conference on Data Mining (ICDM).

[14]  Christos Faloutsos,et al.  REV2: Fraudulent User Prediction in Rating Platforms , 2018, WSDM.

[15]  Lise Getoor,et al.  Collective Classification in Network Data , 2008, AI Mag..

[16]  Ruslan Salakhutdinov,et al.  Revisiting Semi-Supervised Learning with Graph Embeddings , 2016, ICML.

[17]  Noah D. Goodman,et al.  Amortized Inference in Probabilistic Reasoning , 2014, CogSci.

[18]  Andrew McCallum,et al.  Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data , 2001, ICML.

[19]  Steven Skiena,et al.  DeepWalk: online learning of social representations , 2014, KDD.

[20]  Geoffrey E. Hinton,et al.  Rectified Linear Units Improve Restricted Boltzmann Machines , 2010, ICML.

[21]  Max Welling,et al.  Semi-Supervised Classification with Graph Convolutional Networks , 2016, ICLR.

[22]  Jure Leskovec,et al.  node2vec: Scalable Feature Learning for Networks , 2016, KDD.

[23]  M. Opper,et al.  Advanced mean field methods: theory and practice , 2001 .

[24]  Ben Taskar,et al.  Learning Probabilistic Models of Relational Structure , 2001, ICML.

[25]  Hugo Larochelle,et al.  Efficient Learning of Deep Boltzmann Machines , 2010, AISTATS.

[26]  Pedro M. Domingos,et al.  Discriminative Training of Markov Logic Networks , 2005, AAAI.

[27]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[28]  Avi Pfeffer,et al.  Probabilistic Frame-Based Systems , 1998, AAAI/IAAI.

[29]  Nitish Srivastava,et al.  Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res..

[30]  Pietro Liò,et al.  Graph Attention Networks , 2017, ICLR.

[31]  Max Welling,et al.  Auto-Encoding Variational Bayes , 2013, ICLR.

[32]  Ben Taskar,et al.  Probabilistic Models of Text and Link Structure for Hypertext Classification , 2001 .

[33]  Samuel S. Schoenholz,et al.  Neural Message Passing for Quantum Chemistry , 2017, ICML.

[34]  Lise Getoor,et al.  Learning Probabilistic Relational Models , 1999, IJCAI.

[35]  Geoffrey E. Hinton,et al.  A View of the Em Algorithm that Justifies Incremental, Sparse, and other Variants , 1998, Learning in Graphical Models.

[36]  Pedro M. Domingos,et al.  Learning the structure of Markov logic networks , 2005, ICML.

[37]  J. Besag Statistical Analysis of Non-Lattice Data , 1975 .

[38]  Zoubin Ghahramani,et al.  Combining active learning and semi-supervised learning using Gaussian fields and harmonic functions , 2003, ICML 2003.

[39]  Bernhard Schölkopf,et al.  Learning with Local and Global Consistency , 2003, NIPS.

[40]  Lisa Zhang,et al.  Inference in Probabilistic Graphical Models by Graph Neural Networks , 2018, 2019 53rd Asilomar Conference on Signals, Systems, and Computers.

[41]  Pietro Liò,et al.  Deep Graph Infomax , 2018, ICLR.