Graph Representation Learning via Graphical Mutual Information Maximization

The rich content of information networks such as social networks and communication networks offers unprecedented potential for learning high-quality, expressive representations without external supervision. This paper investigates how to extract the abundant information in graph-structured data and preserve it in an embedding space in an unsupervised manner. To this end, we propose a novel concept, Graphical Mutual Information (GMI), to measure the correlation between input graphs and high-level hidden representations. GMI generalizes conventional mutual information computation from vector space to the graph domain, where mutual information must be measured with respect to both node features and topological structure. GMI has several benefits. First, it is invariant to isomorphic transformations of the input graph, an inevitable constraint in many existing graph representation learning algorithms. Second, it can be efficiently estimated and maximized with current mutual information estimation methods such as MINE. Finally, our theoretical analysis confirms its correctness and rationality. With the aid of GMI, we develop an unsupervised learning model trained by maximizing GMI between the input and output of a graph neural encoder. Extensive experiments on transductive and inductive node classification and on link prediction demonstrate that our method outperforms state-of-the-art unsupervised counterparts and sometimes even exceeds the performance of supervised ones.
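To make the estimation step concrete, the following is a minimal sketch of the Donsker-Varadhan lower bound on mutual information, which is the bound that MINE maximizes over a neural critic. It is not the GMI objective itself. The toy correlated Gaussian data, the names `dv_lower_bound` and `optimal_critic`, and the use of an analytically known critic in place of a trained network are all assumptions made for illustration.

```python
import numpy as np


def dv_lower_bound(critic, x, z, rng):
    """Donsker-Varadhan bound: E_joint[T] - log E_marginals[exp(T)]."""
    # Expectation of the critic over paired (joint) samples.
    joint_term = critic(x, z).mean()
    # Shuffling z breaks the pairing and approximates samples from the
    # product of marginals (the same trick MINE uses within a batch).
    z_shuffled = rng.permutation(z)
    marginal_term = np.log(np.exp(critic(x, z_shuffled)).mean())
    return joint_term - marginal_term


# Hypothetical toy data: standard bivariate Gaussian with correlation rho.
rho = 0.5
rng = np.random.default_rng(0)
n = 200_000
x = rng.standard_normal(n)
z = rho * x + np.sqrt(1 - rho**2) * rng.standard_normal(n)


def optimal_critic(x, z):
    """Log density ratio log p(x, z) / (p(x) p(z)) for this Gaussian pair.

    The bound is tight at this critic; MINE would instead learn a critic
    by gradient ascent on the bound.
    """
    quad = rho**2 * x**2 - 2 * rho * x * z + rho**2 * z**2
    return -0.5 * np.log(1 - rho**2) - quad / (2 * (1 - rho**2))


true_mi = -0.5 * np.log(1 - rho**2)  # closed form for a Gaussian pair
estimate = dv_lower_bound(optimal_critic, x, z, rng)
print(f"closed-form MI: {true_mi:.4f}, DV estimate: {estimate:.4f}")
```

With the optimal critic the estimate should land close to the closed-form value, while a constant critic yields a bound of zero. In a graph setting the critic would presumably score pairs of hidden representations and input features or edges, but that design is specific to the method and is not reproduced here.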