Beyond Homophily in Graph Neural Networks: Current Limitations and Effective Designs

We investigate the representation power of graph neural networks in the semi-supervised node classification task under heterophily or low homophily, i.e., in networks where connected nodes may have different class labels and dissimilar features. Many popular GNNs fail to generalize to this setting, and are even outperformed by models that ignore the graph structure (e.g., multilayer perceptrons). Motivated by this limitation, we identify a set of key designs -- ego- and neighbor-embedding separation, higher-order neighborhoods, and combination of intermediate representations -- that boost learning from the graph structure under heterophily. We combine them into a graph neural network, H2GCN, which we use as the base method to empirically evaluate the effectiveness of the identified designs. Going beyond the traditional benchmarks with strong homophily, our empirical analysis shows that the identified designs increase the accuracy of GNNs by up to 40% and 27% over models without them on synthetic and real networks with heterophily, respectively, and yield competitive performance under homophily.
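The three designs named above can be illustrated with a minimal NumPy sketch. This is not the H2GCN implementation: the function names (`edge_homophily`, `row_normalize`, `aggregate`), the mean-style normalization, and the absence of learned weights are all assumptions made for illustration. It only shows how ego- and neighbor-embedding separation, higher-order neighborhoods, and the combination of intermediate representations might fit together.

```python
import numpy as np

def edge_homophily(adj, labels):
    """Fraction of undirected edges whose endpoints share a label (illustrative measure)."""
    src, dst = np.nonzero(np.triu(adj, k=1))
    return float(np.mean(labels[src] == labels[dst]))

def row_normalize(m):
    """Divide each row by its sum; rows that sum to zero stay all-zero."""
    deg = m.sum(axis=1, keepdims=True)
    return np.divide(m, deg, out=np.zeros_like(m, dtype=float), where=deg > 0)

def aggregate(adj, feats, rounds=2):
    """Parameter-free sketch of the three designs (hypothetical helper, no learned weights)."""
    a1 = adj.astype(float).copy()
    # Design 1: no self-loops, so a node's own (ego) embedding is never
    # averaged together with its neighbors' embeddings.
    np.fill_diagonal(a1, 0)
    # Design 2: higher-order neighborhood, here nodes at distance exactly 2.
    a2 = ((a1 @ a1) > 0).astype(float)
    np.fill_diagonal(a2, 0)
    a2[a1 > 0] = 0
    a1n, a2n = row_normalize(a1), row_normalize(a2)

    reps = [feats.astype(float)]  # round 0: ego features only
    for _ in range(rounds):
        h = reps[-1]
        # 1-hop and 2-hop aggregates are concatenated, not summed.
        reps.append(np.concatenate([a1n @ h, a2n @ h], axis=1))
    # Design 3: keep every intermediate representation in the final embedding.
    return np.concatenate(reps, axis=1)
```

On a fully heterophilous path graph with alternating labels, for example, the 1-hop aggregate of a node carries the opposite class while the 2-hop aggregate carries the node's own class. This is the kind of signal that separate, higher-order aggregation is meant to preserve for a downstream classifier.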
