Information theoretic graph kernels

This thesis addresses problems that arise in state-of-the-art structural learning methods for (hyper)graph classification and clustering, focusing in particular on developing novel information theoretic kernels for graphs. To this end, we commence in Chapter 3 by defining a family of Jensen-Shannon diffusion kernels, i.e., information theoretic kernels, for (un)attributed graphs. We show that these kernels overcome two shortcomings of R-convolution kernels: inefficiency (addressed by the unattributed diffusion kernel) and the discarding of non-isomorphic substructures (addressed by the attributed diffusion kernel). In Chapter 4, we present a novel framework for computing depth-based complexity traces rooted at the centroid vertices of graphs, which can be computed efficiently even for large graphs. We show that our methods characterize a graph in a higher dimensional complexity feature space than state-of-the-art complexity measures. In Chapter 5, building on the contribution of Chapter 4, we develop a novel unattributed graph kernel by matching the depth-based substructures in graphs. Unlike most existing graph kernels in the literature, which merely enumerate similar substructure pairs of limited size, our method incorporates explicit local substructure correspondence into the process of kernelization. The new kernel thus overcomes the shortcoming of neglecting structural correspondence that arises in most state-of-the-art graph kernels. The methods developed in Chapters 3, 4, and 5 are restricted to graphs. However, real-world data often involve higher order relationships, which are more naturally represented by hypergraphs. To overcome this limitation, in Chapter 6 we present a new hypergraph kernel using substructure isomorphism tests. We show that our kernel limits the tottering that arises in existing walk and subtree based (hyper)graph kernels. In Chapter 7, we summarize the contributions of this thesis, analyze the proposed methods, and give some suggestions for future work.

[107]  Pietro Perona,et al.  Beyond pairwise clustering , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).