DRew: Dynamically Rewired Message Passing with Delay

Message passing neural networks (MPNNs) have been shown to suffer from the phenomenon of over-squashing, which causes poor performance on tasks relying on long-range interactions. This can be largely attributed to message passing occurring only locally, over a node's immediate neighbours. Rewiring approaches that attempt to make graphs 'more connected', and thus supposedly better suited to long-range tasks, often lose the inductive bias provided by distance on the graph, since they allow distant nodes to communicate instantly at every layer. In this paper we propose a framework, applicable to any MPNN architecture, that performs layer-dependent rewiring to ensure gradual densification of the graph. We also propose a delay mechanism that permits skip connections between nodes, depending on the layer and their mutual distance. We validate our approach on several long-range tasks and show that it outperforms graph Transformers and multi-hop MPNNs.
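To make the two mechanisms concrete, the PyTorch sketch below illustrates one plausible instantiation: layer-dependent rewiring (at layer ℓ, a node aggregates from neighbours up to ℓ+1 hops away, so the effective graph densifies gradually) and delay (a message from a k-hop neighbour is read from that neighbour's state ⌊(k−1)/ν⌋ layers earlier). The class name DRewGNN, the per-hop linear maps, and the exact delay rule are illustrative assumptions, not the paper's reference implementation.

```python
import torch
import torch.nn as nn


class DRewGNN(nn.Module):
    """Minimal sketch of dynamically rewired message passing with delay.

    At layer ell (0-indexed), each node aggregates from neighbours up to
    hop distance ell + 1, so the effective graph densifies gradually over
    depth. With delay parameter nu, a message from a k-hop neighbour is
    read from that neighbour's state (k - 1) // nu layers earlier; this
    delay rule is an illustrative assumption.
    """

    def __init__(self, dim: int, num_layers: int, nu: int = 1):
        super().__init__()
        self.num_layers, self.nu = num_layers, nu
        # one mixing weight per (layer, hop) pair that is ever active
        self.hop_lin = nn.ModuleList([
            nn.ModuleList([nn.Linear(dim, dim) for _ in range(ell + 1)])
            for ell in range(num_layers)
        ])
        self.self_lin = nn.ModuleList(
            [nn.Linear(dim, dim) for _ in range(num_layers)]
        )

    def forward(self, x: torch.Tensor, dist: torch.Tensor) -> torch.Tensor:
        # x: (n, dim) node features; dist: (n, n) shortest-path distances
        history = [x]  # states after each layer, kept for the delay
        for ell in range(self.num_layers):
            out = self.self_lin[ell](history[-1])
            for k in range(1, ell + 2):            # active hops 1 .. ell+1
                mask = (dist == k).float()         # k-hop "adjacency"
                deg = mask.sum(dim=1, keepdim=True).clamp(min=1.0)
                delay = (k - 1) // self.nu         # assumed delay rule
                src = history[ell - delay]         # delayed sender states
                out = out + self.hop_lin[ell][k - 1](mask @ src / deg)
            history.append(torch.relu(out))
        return history[-1]


# Toy usage: a 6-node path graph. Nodes 0 and 5 only exchange messages
# once the layer index makes hop distance 5 active.
n, dim = 6, 8
adj = torch.zeros(n, n)
idx = torch.arange(n - 1)
adj[idx, idx + 1] = adj[idx + 1, idx] = 1.0

# all-pairs shortest paths via Floyd-Warshall (fine for small graphs)
dist = torch.where(adj > 0, adj, torch.full_like(adj, float("inf")))
dist.fill_diagonal_(0.0)
for m in range(n):
    dist = torch.minimum(dist, dist[:, m:m + 1] + dist[m:m + 1, :])

model = DRewGNN(dim=dim, num_layers=5, nu=1)
out = model(torch.randn(n, dim), dist)  # shape (6, 8)
```

Note the design choice this exposes: because hop k only becomes active at layer k−1, and (with ν=1) its messages carry the sender's state from that earlier layer, distance on the graph is preserved as an inductive bias rather than collapsed by instant all-pairs communication.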
