GraphCrop: Subgraph Cropping for Graph Classification

We present a new method to regularize graph neural networks (GNNs) for better generalization in graph classification. Observing that omitting sub-structures does not necessarily change the class label of the whole graph, we develop GraphCrop (Subgraph Cropping), a data augmentation method that simulates the real-world noise of sub-structure omission. GraphCrop uses a node-centric strategy to crop a contiguous subgraph from the original graph while maintaining its connectivity. By preserving the structural context that remains valid for graph classification, we encourage GNNs to understand graph structures in a global sense rather than rely on a few key nodes or edges, which may not always be present. GraphCrop requires no parameter learning and is easy to implement within existing GNN-based graph classifiers. Qualitatively, GraphCrop expands the training set by generating novel and informative augmented graphs, which retain the original graph labels in most cases. Quantitatively, GraphCrop yields significant and consistent gains on multiple standard datasets, enabling popular GNNs to outperform the baseline methods.
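
The node-centric cropping idea can be illustrated with a minimal sketch. The abstract does not say how the retained nodes are chosen, so the code below substitutes a plain breadth-first expansion from a randomly picked center node; this is an assumption made for illustration, not the authors' selection rule. The names `crop_subgraph`, `keep_ratio`, and `is_connected` are hypothetical, and the graph is assumed to be undirected and stored as a dictionary mapping each node to a set of neighbours.

```python
import random
from collections import deque


def crop_subgraph(adj, keep_ratio=0.8, rng=None):
    """Return a connected induced subgraph grown around a random center node.

    Assumption: nodes are selected by breadth-first expansion, which stands in
    for whatever selection rule the paper actually uses.
    """
    rng = rng or random.Random()
    nodes = list(adj)
    if not nodes:
        return {}
    # Number of nodes to keep (at least one).
    target = max(1, int(keep_ratio * len(nodes)))
    center = rng.choice(nodes)
    kept = {center}
    queue = deque([center])
    while queue and len(kept) < target:
        u = queue.popleft()
        neighbours = list(adj[u])
        rng.shuffle(neighbours)  # vary which neighbours survive the crop
        for v in neighbours:
            if v not in kept:
                # v is adjacent to an already kept node, so the crop stays connected.
                kept.add(v)
                queue.append(v)
                if len(kept) == target:
                    break
    # Induced subgraph: keep only edges whose endpoints both survive.
    return {u: adj[u] & kept for u in kept}


def is_connected(adj):
    """Check connectivity of an undirected graph with a breadth-first search."""
    if not adj:
        return True
    start = next(iter(adj))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) == len(adj)


# Usage: crop a 6-node cycle down to roughly two thirds of its nodes.
cycle = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
cropped = crop_subgraph(cycle, keep_ratio=0.67, rng=random.Random(0))
print(sorted(cropped), is_connected(cropped))
```

In a training loop, a crop like this would be applied to each training graph with its label left unchanged, which matches the abstract's observation that omitting sub-structures usually does not alter the class of the whole graph. If the center node lies in a component smaller than the target size, this sketch returns that whole component rather than a crop of the requested size.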
