A Structural Smoothing Framework For Robust Graph Comparison

In this paper, we propose a general smoothing framework for graph kernels that takes structural similarity into account, and we apply it to derive smoothed variants of popular graph kernels. Our framework is inspired by state-of-the-art smoothing techniques used in natural language processing (NLP). However, unlike NLP applications, which primarily deal with strings, we show how smoothing can be applied to a richer class of inter-dependent sub-structures that naturally arise in graphs. Moreover, we discuss extensions of the Pitman-Yor process that can be adapted to smooth structured objects, thereby leading to novel graph kernels. Our kernels tackle the diagonal dominance problem while respecting the structural similarity between features. Experimental evaluation shows that our kernels not only achieve statistically significant improvements over the unsmoothed variants, but also outperform several other graph kernels in the literature. Our kernels are competitive in terms of runtime and offer a viable option for practitioners.
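To make the link between smoothing and diagonal dominance concrete, the following is a minimal sketch and not the framework proposed in the paper. It assumes each graph is already represented as a vector of sub-structure counts, and it applies plain interpolated absolute discounting with a uniform base distribution; the paper instead redistributes mass according to structural similarity between sub-structures. The function names, the discount `d`, and the toy count vectors are all hypothetical.

```python
import math


def normalized_kernel(x, y):
    """Cosine-normalized linear kernel between two feature vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return dot / norm


def absolute_discount(counts, d=0.5):
    """Interpolated absolute discounting with a uniform base distribution.

    Assumes integer counts and 0 < d <= 1. Each observed sub-structure
    gives up `d` of its count; the reserved mass is spread uniformly over
    all sub-structures, including unseen ones. (Illustrative stand-in for
    the structure-aware redistribution described in the abstract.)
    """
    total = sum(counts)
    observed = sum(1 for c in counts if c > 0)
    reserved = d * observed / total   # probability mass taken from seen features
    base = 1.0 / len(counts)          # uniform base distribution (assumption)
    return [max(c - d, 0.0) / total + reserved * base for c in counts]


# Toy sub-structure counts for two graphs with no sub-structure in common.
g1 = [3, 1, 0, 0]
g2 = [0, 0, 2, 2]

unsmoothed = normalized_kernel(g1, g2)
smoothed = normalized_kernel(absolute_discount(g1), absolute_discount(g2))
print(unsmoothed, smoothed)
```

In this toy case the unsmoothed kernel value between the two graphs is exactly zero, so the kernel matrix would be the identity; after discounting, the off-diagonal entry becomes positive because both graphs now place some mass on every sub-structure.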
