Neural Subgraph Isomorphism Counting

In this paper, we study a new graph learning problem: learning to count subgraph isomorphisms. Unlike traditional graph learning problems such as node classification and link prediction, subgraph isomorphism counting builds on an NP-complete matching problem and requires more global inference over the whole graph. To scale to large graphs and patterns, we propose a learning framework that augments different representation learning architectures and iteratively attends to the pattern and the target data graph, memorizing intermediate states of the subgraph isomorphism search for global counting. We construct both a small-graph dataset (<= 1,024 subgraph isomorphisms per graph) and a large-graph dataset (<= 4,096 subgraph isomorphisms per graph) to evaluate different representation and interaction modules. A mutagenic compound dataset, MUTAG, is also used to evaluate the neural models and to demonstrate the success of transfer learning. Although the learning-based approach is inexact, it generalizes to counting large patterns in large data graphs in linear time, compared with the exponential worst-case time of exact search. Experimental results show that learning-based subgraph isomorphism counting runs 10 to 1,000 times faster than the traditional VF2 algorithm with acceptable errors. Domain adaptation based on fine-tuning also shows the usefulness of our approach in real-world applications.
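To make the counting target concrete, the following is a minimal brute-force sketch of exact subgraph isomorphism counting. It is not the proposed neural model and not VF2. It assumes undirected, unlabeled graphs given as edge lists and counts non-induced matches, that is, injective vertex mappings that send every pattern edge to a graph edge. The function name `count_subgraph_isomorphisms` and the example graphs are hypothetical and chosen only for illustration.

```python
from itertools import permutations


def count_subgraph_isomorphisms(pattern_edges, graph_edges):
    """Count injective vertex mappings that send every pattern edge to a graph edge.

    Assumptions (illustrative only): undirected, unlabeled graphs given as
    edge lists; isolated vertices are ignored; matches are non-induced.
    """
    pattern_nodes = sorted({v for edge in pattern_edges for v in edge})
    graph_nodes = sorted({v for edge in graph_edges for v in edge})
    graph_edge_set = {frozenset(edge) for edge in graph_edges}

    count = 0
    # Enumerate every ordered choice of distinct graph vertices as the image
    # of the pattern vertices. The number of candidates grows factorially,
    # which is why exact counting does not scale.
    for image in permutations(graph_nodes, len(pattern_nodes)):
        mapping = dict(zip(pattern_nodes, image))
        if all(
            frozenset((mapping[u], mapping[v])) in graph_edge_set
            for u, v in pattern_edges
        ):
            count += 1
    return count


# Example patterns and data graphs (hypothetical).
triangle = [(0, 1), (1, 2), (0, 2)]
two_edge_path = [(0, 1), (1, 2)]
k4 = [(a, b) for a in range(4) for b in range(a + 1, 4)]

triangle_in_k4 = count_subgraph_isomorphisms(triangle, k4)
path_in_path = count_subgraph_isomorphisms(two_edge_path, two_edge_path)
```

Each mapping is counted separately, so automorphic images of the same subgraph contribute more than once. A learned counter in the sense of the abstract would be trained to regress counts like these from the pattern and the data graph instead of enumerating mappings.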
