Efficient Graph Kernels by Randomization

Learning from complex data is becoming increasingly important, and graph kernels have recently evolved into a rapidly developing branch of learning on structured data. However, previously proposed kernels rely on discrete node label information. In this paper, we explore the power of continuous node-level features for propagation-based graph kernels. Specifically, propagation kernels exploit node label distributions obtained from propagation schemes such as label propagation, which naturally enables the construction of graph kernels for partially labeled graphs. To efficiently extract graph features from continuous node label distributions, and more generally from continuous vector-valued node attributes, we use randomized techniques, which make it straightforward to derive similarity measures based on propagated information. We show that propagation kernels using locality-sensitive hashing reduce the runtime of existing graph kernels by several orders of magnitude. We evaluate the performance of various propagation kernels on real-world bioinformatics and image benchmark datasets.
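The idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`propagate`, `lsh_bins`, `propagation_kernel`), the parameters `t_max` and `w`, and the choice of a Gaussian random projection as the hash function are all assumptions made for illustration. Each iteration hashes every node's label distribution into a discrete bin, counts bin occupancies per graph, adds the inner product of those count vectors to the kernel matrix, and then runs one label-propagation step.

```python
import numpy as np

def propagate(adj, dist):
    """One label-propagation step: average each node's neighbor distributions."""
    deg = adj.sum(axis=1, keepdims=True)
    deg[deg == 0] = 1.0  # isolated nodes: avoid division by zero
    return (adj / deg) @ dist

def lsh_bins(x, w, a, b):
    """Hash each row of x to an integer bin: floor((x . a + b) / w)."""
    return np.floor((x @ a + b) / w).astype(np.int64)

def propagation_kernel(graphs, t_max=3, w=1e-2, seed=0):
    """Sketch of a propagation-style kernel (illustrative, hypothetical API).

    graphs: list of (adjacency matrix, node label distributions) pairs,
            where all distribution matrices share the same number of columns.
    Returns the Gram matrix summed over iterations 0..t_max.
    """
    rng = np.random.default_rng(seed)
    adjs = [np.asarray(a, dtype=float) for a, _ in graphs]
    dists = [np.asarray(d, dtype=float) for _, d in graphs]
    dim = dists[0].shape[1]
    K = np.zeros((len(graphs), len(graphs)))
    for _ in range(t_max + 1):
        # One random hash function shared by all graphs in this iteration.
        a = rng.standard_normal(dim)
        b = rng.uniform(0.0, w)
        bins = [lsh_bins(d, w, a, b) for d in dists]
        keys = np.unique(np.concatenate(bins))
        # Per-graph histogram of how many nodes fall into each bin.
        counts = np.array([[np.sum(bn == k) for k in keys] for bn in bins],
                          dtype=float)
        K += counts @ counts.T  # linear kernel on the bin counts
        dists = [propagate(A, d) for A, d in zip(adjs, dists)]
    return K
```

Calling `propagation_kernel` on a list of `(adjacency, distributions)` pairs returns a symmetric Gram matrix that can be passed to a kernel method such as an SVM with a precomputed kernel. The bin width `w` controls how coarsely continuous distributions are discretized; the value used here is a placeholder, not a recommendation from the paper.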
