Generalization Limits of Graph Neural Networks in Identity Effects Learning

Graph Neural Networks (GNNs) have emerged as a powerful tool for data-driven learning on various graph domains. They are usually based on a message-passing mechanism and have gained popularity for their intuitive formulation, which is closely linked to the Weisfeiler-Lehman (WL) test for graph isomorphism, to which they have been proven equivalent in terms of expressive power. In this work, we establish new generalization properties and fundamental limits of GNNs in the context of learning so-called identity effects, i.e., the task of determining whether an object is composed of two identical components. Our study is motivated by the need to understand the capabilities of GNNs when performing simple cognitive tasks, with potential applications in computational linguistics and chemistry. We analyze two case studies: (i) two-letter words, for which we show that GNNs trained via stochastic gradient descent are unable to generalize to unseen letters when using orthogonal encodings such as one-hot representations; (ii) dicyclic graphs, i.e., graphs composed of two cycles, for which we present positive existence results leveraging the connection between GNNs and the WL test. Our theoretical analysis is supported by an extensive numerical study.
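
To make the two-letter word setting concrete, the following is a minimal Python sketch of an identity-effect dataset with one-hot letter encodings. It is an illustration under stated assumptions, not the experimental setup of this work: the 26-letter alphabet, the choice of held-out letters `Y` and `Z`, the representation of a word as a pair of node feature vectors, and all function names are hypothetical. The sketch only shows the property the abstract refers to, namely that the one-hot encoding of an unseen letter is orthogonal to the encoding of every training letter.

```python
from itertools import product
import string


def one_hot(letter, alphabet):
    """Return the one-hot encoding of `letter` over `alphabet`."""
    return [1.0 if c == letter else 0.0 for c in alphabet]


def encode_word(word, alphabet):
    """Encode a two-letter word as a pair of node feature vectors."""
    return (one_hot(word[0], alphabet), one_hot(word[1], alphabet))


def identity_label(word):
    """Return 1 if the word consists of two identical letters, else 0."""
    return 1 if word[0] == word[1] else 0


def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


alphabet = string.ascii_uppercase
held_out = {"Y", "Z"}  # assumption: letters never seen during training
seen = [c for c in alphabet if c not in held_out]

# Training words use only seen letters; test words use only unseen ones.
train_words = ["".join(p) for p in product(seen, repeat=2)]
test_words = ["YY", "ZZ", "YZ", "ZY"]

train_set = [(encode_word(w, alphabet), identity_label(w)) for w in train_words]
test_set = [(encode_word(w, alphabet), identity_label(w)) for w in test_words]
test_labels = [label for _, label in test_set]

# Largest inner product between an unseen letter and any training letter.
max_overlap = max(
    dot(one_hot(u, alphabet), one_hot(s, alphabet))
    for u in held_out
    for s in seen
)
```

Here `max_overlap` is zero because distinct one-hot vectors share no nonzero coordinate, so the training inputs carry no component along the directions of the held-out letters.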
