Learning Invariant Graph Representations for Out-of-Distribution Generalization

Graph representation learning has shown effectiveness when the testing and training graph data come from the same distribution, but most existing approaches fail to generalize under distribution shifts. Invariant learning, backed by the invariance principle from causality, can in theory achieve guaranteed generalization under distribution shifts and has shown great success in practice. However, invariant learning for graphs under distribution shifts remains unexplored and challenging. To solve this problem, we propose the Graph Invariant Learning (GIL) model, which is capable of learning generalized graph representations under distribution shifts. Our proposed method captures the invariant relationships between predictive graph structural information and labels in a mixture of latent environments by jointly optimizing three tailored modules. Specifically, we first design a GNN-based subgraph generator to identify invariant subgraphs. Then we use the variant subgraphs, i.e., the complements of the invariant subgraphs, to infer the latent environment labels. We further propose an invariant learning module to learn graph representations that generalize to unknown test graphs. Theoretical justifications for our proposed method are also provided. Extensive experiments on both synthetic and real-world datasets demonstrate the superiority of our method over state-of-the-art baselines on the graph classification task under distribution shifts.
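To make the three-module pipeline concrete, below is a minimal PyTorch sketch of how such an architecture could be wired together. It is an illustration under stated assumptions, not the paper's implementation: the dense-adjacency message passing, the edge-scoring subgraph generator, the k-means-style environment inference over variant-subgraph embeddings, and the variance-of-risks invariance penalty are placeholders chosen for brevity, and all names (DenseGNNLayer, SubgraphGenerator, GILSketch, infer_environments, invariant_risk) are hypothetical.

```python
# Minimal sketch of the three-module pipeline described above (assumptions
# noted in the surrounding text); not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DenseGNNLayer(nn.Module):
    """One round of mean-aggregation message passing on a dense adjacency."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        # x: (N, in_dim); adj: (N, N) hard or soft adjacency.
        deg = adj.sum(dim=-1, keepdim=True).clamp(min=1.0)
        return F.relu(self.lin(adj @ x / deg + x))


class SubgraphGenerator(nn.Module):
    """Module 1 (hypothetical design): score edges with a GNN; high-scoring
    edges form a soft invariant subgraph, the remaining mass the variant one."""

    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.gnn = DenseGNNLayer(in_dim, hid_dim)
        self.edge_scorer = nn.Linear(2 * hid_dim, 1)

    def forward(self, x, adj):
        h = self.gnn(x, adj)                                 # (N, hid_dim)
        n = h.size(0)
        pair = torch.cat([h.unsqueeze(1).expand(n, n, -1),
                          h.unsqueeze(0).expand(n, n, -1)], dim=-1)
        s = torch.sigmoid(self.edge_scorer(pair)).squeeze(-1)
        return s * adj, (1.0 - s) * adj                      # invariant / variant masks


class GILSketch(nn.Module):
    """Modules 1-3 combined: generator, shared encoder, invariant classifier."""

    def __init__(self, in_dim, hid_dim, num_classes):
        super().__init__()
        self.generator = SubgraphGenerator(in_dim, hid_dim)
        self.encoder = DenseGNNLayer(in_dim, hid_dim)
        self.classifier = nn.Linear(hid_dim, num_classes)

    def forward(self, x, adj):
        inv_adj, var_adj = self.generator(x, adj)
        z_inv = self.encoder(x, inv_adj).mean(dim=0)         # invariant embedding
        z_var = self.encoder(x, var_adj).mean(dim=0)         # variant embedding
        return self.classifier(z_inv), z_var


def infer_environments(z_var, num_envs=3, iters=10):
    """Module 2 (toy version): k-means over the variant-subgraph embeddings of
    a batch of graphs to assign latent environment labels."""
    z_var = z_var.detach()
    centers = z_var[torch.randperm(len(z_var))[:num_envs]].clone()
    for _ in range(iters):
        labels = torch.cdist(z_var, centers).argmin(dim=1)
        for k in range(num_envs):
            if (labels == k).any():
                centers[k] = z_var[labels == k].mean(dim=0)
    return labels


def invariant_risk(logits, y, envs, num_envs=3, penalty=1.0):
    """Module 3 (REx-style stand-in): mean per-environment risk plus the
    variance of those risks as an invariance penalty."""
    risks = [F.cross_entropy(logits[envs == k], y[envs == k])
             for k in range(num_envs) if (envs == k).any()]
    risks = torch.stack(risks)
    var = risks.var() if len(risks) > 1 else torch.zeros((), device=risks.device)
    return risks.mean() + penalty * var


# Usage sketch for a batch of B graphs, each with node features x and adjacency adj:
#   logits, z_vars = zip(*(model(x, adj) for x, adj in batch))
#   envs = infer_environments(torch.stack(z_vars))
#   loss = invariant_risk(torch.stack(logits), y, envs)
```

The structural idea mirrored here is that the classifier only ever sees the invariant-subgraph embedding, while the variant-subgraph embedding is used solely to partition graphs into latent environments that drive the invariance penalty.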
