Mixup for Node and Graph Classification

Mixup is a data augmentation method for training neural-network-based image classifiers that interpolates both the features and the labels of a pair of images to produce synthetic samples. Devising Mixup methods for graph learning is challenging, however, because graph data are irregular and their nodes are connected. In this paper, we propose Mixup methods for two fundamental tasks in graph learning: node classification and graph classification. To interpolate the irregular graph topology, we propose a two-branch graph convolution that mixes the receptive field subgraphs of the paired nodes. Because nodes are connected, Mixup on different node pairs can interfere with each other's mixed features. To block this interference, we propose a two-stage Mixup framework in which graph convolutions use the representations that each node's neighbors had before Mixup. For graph classification, we interpolate complex and diverse graphs in the semantic space. Qualitatively, our Mixup methods enable GNNs to learn more discriminative features and reduce over-fitting. Quantitatively, our methods yield consistent gains in test accuracy and F1-micro scores on standard datasets for both node and graph classification. Overall, our methods effectively regularize popular graph neural networks for better generalization without increasing their time complexity.
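As a rough illustration of the interpolation that Mixup performs on a pair of samples, the sketch below mixes two feature vectors and their one-hot labels with a single coefficient. It covers only the generic feature-and-label interpolation, not the two-branch graph convolution or the two-stage framework proposed here. The function name `mixup_pair`, the parameter `alpha`, and the choice to draw the coefficient from a Beta(alpha, alpha) distribution are assumptions made for this example.

```python
import random


def mixup_pair(x_i, y_i, x_j, y_j, alpha=1.0, rng=None):
    """Interpolate two samples (hypothetical helper, not the paper's code).

    x_i, x_j: feature vectors of equal length.
    y_i, y_j: one-hot label vectors of equal length.
    alpha:    assumed Beta(alpha, alpha) parameter for the mixing coefficient.
    """
    rng = rng or random.Random()
    lam = rng.betavariate(alpha, alpha)  # mixing coefficient in [0, 1]
    # The same coefficient is applied to the features and to the labels.
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_i, x_j)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_i, y_j)]
    return x_mix, y_mix, lam


# Toy pair: two 2-dimensional samples from two different classes.
x_mix, y_mix, lam = mixup_pair(
    [1.0, 0.0], [1.0, 0.0],
    [0.0, 2.0], [0.0, 1.0],
    alpha=1.0,
    rng=random.Random(0),
)
```

The mixed label `y_mix` is a soft label whose entries sum to one, so it can be used directly with a cross-entropy loss that accepts probability targets.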
