Learning to Pool in Graph Neural Networks for Extrapolation

Graph neural networks (GNNs) are among the most popular approaches to deep learning on graph-structured data, and they have achieved state-of-the-art performance on a variety of tasks. However, according to a recent study, a careful choice of pooling functions, which are used for the aggregation and readout operations in GNNs, is crucial for enabling GNNs to extrapolate. Without the ideal combination of pooling functions, which varies across tasks, GNNs completely fail to generalize to out-of-distribution data, yet the number of possible combinations grows exponentially with the number of layers, making exhaustive search impractical. In this paper, we present GNP, an L^p norm-like pooling function that is trainable end-to-end for any given task. Notably, GNP generalizes most of the widely-used pooling functions, such as sum, mean, and max. We verify experimentally that simply replacing all pooling functions with GNP enables GNNs to extrapolate well on many node-level, graph-level, and set-related tasks, and GNP sometimes performs even better than the optimal combination of existing pooling functions.
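
To make the idea concrete, below is a minimal PyTorch sketch of a trainable power-mean pooling with learnable exponents, which recovers mean, sum, and (in the limit) max pooling as special cases. The class name PowerMeanPooling, the n^q scale factor, and the exact parameterization are illustrative assumptions for this sketch, not the paper's precise definition of GNP.

```python
import torch
import torch.nn as nn


class PowerMeanPooling(nn.Module):
    """Trainable L^p norm-like pooling (illustrative sketch, not the paper's exact GNP).

    pool(X) = n^q * ( mean_i |x_i|^p )^(1/p), with p and q learned end-to-end.
    Special cases (up to the absolute value taken on the inputs):
      p = 1, q = 0  -> mean pooling
      p = 1, q = 1  -> sum pooling
      p -> infinity -> max pooling
    """

    def __init__(self, init_p: float = 1.0, init_q: float = 0.0, eps: float = 1e-6):
        super().__init__()
        # Both exponents are trained jointly with the rest of the GNN.
        self.p = nn.Parameter(torch.tensor(init_p))
        self.q = nn.Parameter(torch.tensor(init_q))
        self.eps = eps  # keeps fractional powers of zeros numerically stable

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_items, feature_dim); pools over dim 0, e.g., a node's neighbor
        # embeddings (aggregation) or all node embeddings of a graph (readout).
        n = x.size(0)
        p = self.p.clamp(min=1.0)  # keep the norm well-defined
        powered = (x.abs() + self.eps).pow(p)
        out = powered.mean(dim=0).pow(1.0 / p)
        return (float(n) ** self.q) * out


# Usage: pool five 16-dimensional neighbor embeddings into one vector.
pool = PowerMeanPooling()
h = torch.randn(5, 16)
g = pool(h)  # shape: (16,)
```

Learning p and q jointly with the GNN weights lets gradient descent choose, per layer and per task, where on the mean-sum-max spectrum each pooling operation should sit, which is the property the abstract attributes to GNP.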
