Size-Invariant Graph Representations for Graph Classification Extrapolations

Graph representation learning methods generally assume that train and test data come from the same distribution. In this work we consider an underexplored area of the otherwise rapidly developing field of graph representation learning: the task of out-of-distribution (OOD) graph classification, where train and test data have different distributions and test data is unavailable during training. We show that it is possible to use a causal model to learn approximately invariant representations that better extrapolate between train and test data. We conclude with experiments on synthetic and real-world datasets that showcase the benefits of representations that are invariant to train/test distribution shifts.
