Graph Classification using Structural Attention

Graph classification is a problem with practical applications in many domains. To solve it, one usually computes graph statistics (i.e., graph features) that help discriminate between graphs of different classes. Most existing approaches process the entire graph when computing such features. A graphlet-based approach, for instance, processes the entire graph to obtain the total count of each graphlet or subgraph. In many real-world applications, however, graphs are noisy, and the discriminative patterns are confined to certain regions of the graph. In this work, we study the problem of attention-based graph classification. Attention allows us to focus on small but informative parts of the graph and to avoid the noise in the rest of it. We present a novel RNN model, called the Graph Attention Model (GAM), that processes only a portion of the graph by adaptively selecting a sequence of "informative" nodes. Experimental results on multiple real-world datasets show that the proposed method is competitive with various well-known graph classification methods even though it is limited to only a portion of the graph.
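
The contrast drawn above, between whole-graph feature extraction and processing only a selected sequence of nodes, can be illustrated with a minimal sketch. It is not the GAM model itself: `count_3node_graphlets` is a plain whole-graph graphlet counter, and `guided_walk` is a hypothetical stand-in in which a hand-set `preference` over node labels replaces the learned, RNN-driven node selection. The toy graph, the labels, and all function names are assumptions made for illustration.

```python
from itertools import combinations


def count_3node_graphlets(adj):
    """Count triangles and open wedges by visiting every node of the graph."""
    triangles = 0
    wedges = 0
    for center, nbrs in adj.items():
        for u, v in combinations(sorted(nbrs), 2):
            if v in adj[u]:
                triangles += 1  # each triangle is seen once per corner
            else:
                wedges += 1     # open path u - center - v
    return {"triangle": triangles // 3, "wedge": wedges}


def guided_walk(adj, labels, preference, start, steps):
    """Visit at most steps + 1 nodes, greedily moving to the neighbor whose
    label has the highest hand-set score (an assumption, not a learned policy)."""
    visited = [start]
    current = start
    for _ in range(steps):
        candidates = sorted(adj[current])
        if not candidates:
            break
        current = max(candidates, key=lambda n: preference.get(labels[n], 0.0))
        visited.append(current)
    return visited


# Toy undirected graph: a triangle (0, 1, 2) with a tail 2 - 3 - 4.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {3}}
labels = {0: "A", 1: "A", 2: "B", 3: "C", 4: "C"}
preference = {"C": 1.0, "B": 0.5}  # hypothetical scores per node label

whole_graph_features = count_3node_graphlets(adj)  # touches all 5 nodes
walk = guided_walk(adj, labels, preference, start=0, steps=3)
print(whole_graph_features, walk)
```

The counter has to touch every node and every neighbor pair, whereas the walk inspects only the neighborhoods of the nodes it visits, which is the budgeted setting the abstract describes.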
