How to Find Your Friendly Neighborhood: Graph Attention Design with Self-Supervision

The attention mechanism in graph neural networks is designed to assign larger weights to important neighbor nodes for better representations. However, what graph attention learns is not well understood, particularly when graphs are noisy. In this paper, we propose the self-supervised graph attention network (SuperGAT), an improved graph attention model for noisy graphs. Specifically, we exploit two attention forms compatible with a self-supervised task of predicting edges, whose presence and absence contain inherent information about the importance of the relationships between nodes. By encoding edges, SuperGAT learns more expressive attention for distinguishing mislinked neighbors. We find that two graph characteristics influence the effectiveness of attention forms and self-supervision: homophily and average degree. Our recipe therefore provides guidance on which attention design to use when those two graph characteristics are known. Our experiments on 17 real-world datasets demonstrate that the recipe generalizes to 15 of them, and that models designed by the recipe outperform baselines.
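The core idea, using the presence and absence of edges as a self-supervised signal for attention, can be illustrated with a minimal sketch. The snippet below assumes a PyTorch setting, a single dot-product attention head, and illustrative names (attention_logits, self_supervised_edge_loss, neg_edge_index); it is not the paper's reference implementation, only a sketch of how attention logits can double as edge-prediction logits trained with binary cross-entropy against sampled non-edges.

```python
# A minimal, hypothetical sketch (not the authors' code) of a self-supervised
# edge-prediction objective, assuming PyTorch and a single dot-product
# attention head over node representations h.
import torch
import torch.nn.functional as F


def attention_logits(h, edge_index):
    # Dot-product attention logit e_ij = <h_i, h_j> for every pair in edge_index.
    src, dst = edge_index            # edge_index: LongTensor of shape [2, E]
    return (h[src] * h[dst]).sum(dim=-1)


def self_supervised_edge_loss(h, pos_edge_index, neg_edge_index):
    # Push attention logits up on observed edges (positives) and down on
    # sampled non-edges (negatives) with binary cross-entropy, so the same
    # logits that weight neighbors also predict edge presence.
    pos = attention_logits(h, pos_edge_index)
    neg = attention_logits(h, neg_edge_index)
    logits = torch.cat([pos, neg])
    labels = torch.cat([torch.ones_like(pos), torch.zeros_like(neg)])
    return F.binary_cross_entropy_with_logits(logits, labels)
```

In practice, a loss of this kind would be combined with the standard node-classification cross-entropy via a weighting coefficient, with negative edges drawn by negative sampling at each step; the specific attention forms and weighting used by SuperGAT are defined in the paper itself.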
