Recurrent subgraph prediction

Interactions in dynamic networks often involve more than two nodes and emerge as subgraphs. The evolution of these subgraphs cannot be fully predicted by pairwise link prediction alone. We propose Prediction of Recurrent Subgraphs (PReSub), a method that treats subgraphs as individual entities in their own right. PReSub predicts recurring subgraphs using the network's vector space embedding and a set of "early warning subgraphs" that act as global and local descriptors of a subgraph's behavior. PReSub can be used as an out-of-the-box pipeline with user-provided subgraphs, or to discover interesting subgraphs in an unsupervised manner. It can handle missing network information and is parallelizable. We show that PReSub outperforms traditional pairwise link prediction on a variety of evolving network datasets. The goal of this framework is to improve our understanding of subgraphs and to provide an alternative representation for characterizing their behavior.
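As a rough illustration of what a vector space embedding of a network snapshot can look like, the sketch below maps a snapshot to a vector of distances to a few prototype graphs and checks whether a given subgraph occurs in it. This is not the PReSub implementation: the prototype-based embedding, the edge-set distance (a crude stand-in for a graph edit distance), and all names (`make_graph`, `edge_distance`, `embed`, `subgraph_present`) are assumptions made for illustration only.

```python
from typing import FrozenSet, Iterable, List, Tuple

Edge = Tuple[str, str]
Graph = FrozenSet[Edge]


def make_graph(edges: Iterable[Tuple[str, str]]) -> Graph:
    # Undirected graph: store each edge with its endpoints sorted,
    # so ("b", "a") and ("a", "b") are the same edge.
    return frozenset(tuple(sorted(e)) for e in edges)


def edge_distance(g: Graph, h: Graph) -> int:
    # Hypothetical dissimilarity: number of edges present in exactly one graph.
    # A real pipeline would likely use a proper graph edit distance instead.
    return len(g ^ h)


def embed(snapshot: Graph, prototypes: List[Graph]) -> List[int]:
    # One coordinate per prototype: the distance from the snapshot to it.
    return [edge_distance(snapshot, p) for p in prototypes]


def subgraph_present(snapshot: Graph, subgraph: Graph) -> bool:
    # The subgraph "occurs" if all of its labeled edges are in the snapshot.
    return subgraph <= snapshot


# Two illustrative prototypes on labeled nodes: a triangle and a path.
triangle = make_graph([("a", "b"), ("b", "c"), ("a", "c")])
path = make_graph([("a", "b"), ("b", "c")])

# One snapshot of an evolving network.
snapshot = make_graph([("b", "a"), ("c", "b"), ("c", "d")])

features = embed(snapshot, [triangle, path])
print(features)                              # [2, 1]
print(subgraph_present(snapshot, path))      # True
print(subgraph_present(snapshot, triangle))  # False
```

In a sketch like this, the embedding vectors of past snapshots would serve as features, and the presence flag of the subgraph in a later snapshot would serve as the label for a standard classifier.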
