Transformer and Snowball Graph Convolution Learning for Biomedical Graph Classification

Graphs and networks are widely used to describe and model complex systems in biomedicine. Deep learning methods, especially graph neural networks (GNNs), have been developed to learn from and predict on such structured data. In this paper, we propose transformer and snowball encoding networks (TSEN), a novel architecture for biomedical graph classification that introduces the transformer architecture with graph snowball connections into GNNs for learning whole-graph representations. TSEN combines graph snowball connections with graph transformers through snowball encoding layers, which strengthens its ability to capture multi-scale information and global patterns when learning whole-graph features. In addition, TSEN uses snowball graph convolution as the position embedding in the transformer structure, a simple yet effective way to capture local patterns naturally. Experiments on four graph classification datasets demonstrate that TSEN outperforms both state-of-the-art typical GNN models and graph-transformer-based GNN models.
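To make the architecture concrete, the sketch below illustrates the idea of a snowball encoding layer as described above: a snowball graph convolution stack, in which each layer consumes the concatenation of all previous layers' outputs, produces structure-aware node embeddings that serve as position embeddings for standard multi-head self-attention. This is a minimal illustration of the concept, not the authors' implementation; the class name SnowballEncodingLayer and the parameters n_conv and n_heads are hypothetical, and details such as activation choice and normalization are assumptions.

```python
import torch
import torch.nn as nn

class SnowballEncodingLayer(nn.Module):
    """Hypothetical sketch of a TSEN-style snowball encoding layer:
    snowball graph convolution supplies position-aware node embeddings,
    which are then fed to global multi-head self-attention."""

    def __init__(self, dim, n_conv=3, n_heads=4):
        super().__init__()
        # Snowball convolutions: layer k maps (k + 1) * dim -> dim,
        # since it sees the concatenation of all previous outputs.
        self.convs = nn.ModuleList(
            nn.Linear((k + 1) * dim, dim) for k in range(n_conv)
        )
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x, adj):
        # x: (n_nodes, dim) node features; adj: (n_nodes, n_nodes)
        # normalized adjacency matrix.
        outs = [x]
        for conv in self.convs:
            h = torch.cat(outs, dim=-1)             # snowball: concat all previous outputs
            outs.append(torch.tanh(adj @ conv(h)))  # one graph convolution step
        pos = outs[-1]                              # snowball output as position embedding
        h = (x + pos).unsqueeze(0)                  # add positions; add batch dimension
        out, _ = self.attn(h, h, h)                 # global self-attention over all nodes
        return self.norm(out.squeeze(0) + x)        # residual connection + layer norm
```

Under this reading, the snowball convolution plays the role that Laplacian eigenvector or other spectral position encodings play in earlier graph transformers: it injects local, multi-scale structural information into the attention layer without any eigendecomposition.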
