Debiased Graph Neural Networks With Agnostic Label Selection Bias

Most existing Graph Neural Networks (GNNs) are proposed without considering the selection bias in data, i.e., the inconsistent distribution between the training set and the test set. In reality, the test data are not even available during training, making the selection bias agnostic. Training GNNs on nodes selected with bias leads to significant parameter estimation bias and greatly impairs generalization to test nodes. In this paper, we first present an experimental investigation that clearly shows the selection bias drastically hinders the generalization ability of GNNs, and we theoretically prove that selection bias causes biased estimation of GNN parameters. To remove this bias in GNN estimation, we then propose a novel Debiased Graph Neural Network (DGNN) with a differentiated decorrelation regularizer. The regularizer estimates a sample weight for each labeled node such that the spurious correlations among the learned embeddings can be eliminated. Analyzing the regularizer from a causal view motivates us to differentiate the weights of variables according to their contribution to the confounding bias. These sample weights are then used to reweight the GNN training objective so as to eliminate the estimation bias, thereby improving the stability of prediction on unknown test nodes. Comprehensive experiments on several challenging graph datasets with two kinds of label selection bias verify that our proposed model outperforms state-of-the-art methods and that DGNN is a flexible framework for enhancing existing GNNs.
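To make the decorrelation-and-reweighting idea concrete, the sketch below (in PyTorch) shows one plausible way to learn sample weights that suppress the weighted covariance between embedding dimensions and then use those weights in the classification loss. This is a minimal illustration, not the authors' implementation: the weight parametrization (softmax over free scores), the optimizer, the number of steps, and the random stand-in embeddings are all assumptions, and the paper's differentiated (per-variable) weighting is omitted for brevity.

```python
# Minimal sketch of decorrelation-based sample reweighting (not the
# authors' code). Z stands in for embeddings of labeled nodes produced
# by any GNN encoder; the weights w are learned so that the weighted
# covariance between distinct embedding dimensions is driven toward
# zero, then used to reweight the node classification loss.
import torch

def decorrelation_loss(Z: torch.Tensor, w: torch.Tensor) -> torch.Tensor:
    """Sum of squared weighted covariances between distinct embedding dims.

    Z: (n, d) embeddings of labeled nodes; w: (n,) weights summing to 1.
    """
    mean = (w.unsqueeze(1) * Z).sum(0)     # weighted mean per dimension
    Zc = Z - mean                          # center embeddings
    cov = (w.unsqueeze(1) * Zc).t() @ Zc   # weighted covariance, (d, d)
    off_diag = cov - torch.diag(torch.diag(cov))
    return (off_diag ** 2).sum()           # penalize only cross terms

# Toy setup: random stand-ins for embeddings, labels, and logits.
n, d, num_classes = 140, 16, 7
Z = torch.randn(n, d)
y = torch.randint(0, num_classes, (n,))
logits = torch.randn(n, num_classes, requires_grad=True)

# Parametrize weights as a softmax over free scores so they stay
# positive and normalized, then learn them to decorrelate Z.
scores = torch.zeros(n, requires_grad=True)
opt_w = torch.optim.Adam([scores], lr=1e-2)
for _ in range(200):
    opt_w.zero_grad()
    loss_w = decorrelation_loss(Z, torch.softmax(scores, 0))
    loss_w.backward()
    opt_w.step()

# Rescale so the average weight is 1, then reweight the GNN loss.
w = (torch.softmax(scores, 0) * n).detach()
ce = torch.nn.functional.cross_entropy(logits, y, reduction="none")
weighted_loss = (w * ce).mean()
```

In practice the two stages would alternate: re-estimate the weights as the GNN embeddings evolve, then update the GNN parameters under the reweighted objective.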
