Self-supervised Auxiliary Learning for Graph Neural Networks via Meta-Learning

In recent years, graph neural networks (GNNs) have been widely adopted for representation learning on graph-structured data and have provided state-of-the-art performance in various applications such as link prediction and node classification. At the same time, self-supervised learning has been studied as a way to leverage rich unlabeled data in representation learning on graphs. However, employing self-supervision tasks as auxiliary tasks to assist a primary task has been less explored in the literature on graphs. In this paper, we propose a novel self-supervised auxiliary learning framework to effectively train graph neural networks. Moreover, we are the first to design meta-path prediction as a self-supervised auxiliary task for heterogeneous graphs. Our method learns to learn a primary task with various auxiliary tasks in order to improve generalization performance. It identifies an effective combination of auxiliary tasks and automatically balances them to improve the primary task. Our method can be applied to any graph neural network in a plug-in manner without manual labeling or additional data, and it can be extended to other auxiliary tasks. Our experiments demonstrate that the proposed method consistently improves the performance of link prediction and node classification on heterogeneous graphs.
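
To make the meta-path prediction task concrete, the following is a minimal sketch of how binary labels for such an auxiliary task could be derived from the graph itself, with no manual labeling. It is an illustration under stated assumptions, not the implementation used in the paper: the edge-triple format, the function name meta_path_pairs, and the toy user/movie graph are all hypothetical. A node pair gets label 1 if it is connected by a given sequence of edge types (a meta-path) and 0 otherwise.

```python
from collections import defaultdict


def meta_path_pairs(edges, meta_path):
    """Return the set of (source, target) pairs connected by `meta_path`.

    edges: iterable of (src, edge_type, dst) triples (hypothetical format).
    meta_path: sequence of edge types, e.g. ("rates", "rated_by").
    """
    # Index outgoing neighbours by (node, edge type).
    adj = defaultdict(set)
    for src, etype, dst in edges:
        adj[(src, etype)].add(dst)

    # Only nodes with an outgoing edge of the first type can start the path.
    sources = {src for src, etype, _ in edges if etype == meta_path[0]}

    pairs = set()
    for start in sources:
        frontier = {start}
        # Follow the meta-path one edge type at a time.
        for etype in meta_path:
            frontier = {nxt for node in frontier for nxt in adj[(node, etype)]}
        pairs.update((start, end) for end in frontier)
    return pairs


# Toy heterogeneous graph (illustrative data only).
edges = [
    ("u1", "rates", "m1"), ("u2", "rates", "m1"), ("u3", "rates", "m2"),
    ("m1", "rated_by", "u1"), ("m1", "rated_by", "u2"), ("m2", "rated_by", "u3"),
]

positives = meta_path_pairs(edges, ("rates", "rated_by"))


def label(pair):
    """Binary self-supervised label for the auxiliary task."""
    return 1 if pair in positives else 0
```

In this toy graph, u1 and u2 rated the same movie, so label(("u1", "u2")) is 1, while label(("u1", "u3")) is 0. In a setup like the one the abstract describes, each meta-path would define one such auxiliary task, and the per-task losses would then be weighted and combined with the primary-task loss; how those weights are learned is not specified in the abstract and is not shown here.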
