Graph embedding in vector spaces by node attribute statistics

Graph-based representations are of broad use and applicability in pattern recognition. They exhibit, however, a major drawback with regards to the processing tools that are available in their domain. Graph embedding into vector spaces is a growing field among the structural pattern recognition community which aims at providing a feature vector representation for every graph, and thus enables classical statistical learning machinery to be used on graph-based input patterns. In this work, we propose a novel embedding methodology for graphs with continuous node attributes and unattributed edges. The approach presented in this paper is based on statistics of the node labels and the edges between them, based on their similarity to a set of representatives. We specifically deal with an important issue of this methodology, namely, the selection of a suitable set of representatives. In an experimental evaluation, we empirically show the advantages of this novel approach in the context of different classification problems using several databases of graphs.

[1]  Alexander J. Smola,et al.  Learning with Kernels: support vector machines, regularization, optimization, and beyond , 2001, Adaptive computation and machine learning series.

[2]  King-Sun Fu,et al.  A distance measure between attributed relational graphs for pattern recognition , 1983, IEEE Transactions on Systems, Man, and Cybernetics.

[3]  Christopher G. Harris,et al.  A Combined Corner and Edge Detector , 1988, Alvey Vision Conference.

[4]  Miro Kraetzl,et al.  Graph distances using graph union , 2001, Pattern Recognit. Lett..

[5]  Ernest Valveny,et al.  Graph of Words Embedding for Molecular Structure-Activity Relationship Analysis , 2010, CIARP.

[6]  D. Rubin,et al.  Maximum likelihood from incomplete data via the EM - algorithm plus discussions on the paper , 1977 .

[7]  David G. Stork,et al.  Pattern Classification , 1973 .

[8]  Heekuck Oh,et al.  Neural Networks for Pattern Recognition , 1993, Adv. Comput..

[9]  Uchida Seiichi,et al.  Part-based recognition of handwritten digits -- * , 2010 .

[10]  Bernhard Schölkopf,et al.  Dynamic Alignment Kernels , 2000 .

[11]  Ernest Valveny,et al.  Embedding of Graphs with discrete Attributes via Label frequencies , 2013, Int. J. Pattern Recognit. Artif. Intell..

[12]  Thomas Gärtner,et al.  On Graph Kernels: Hardness Results and Efficient Alternatives , 2003, COLT.

[13]  Kaspar Riesen,et al.  Graph Classification and Clustering Based on Vector Space Embedding , 2010, Series in Machine Perception and Artificial Intelligence.

[14]  Hisashi Kashima,et al.  Marginalized Kernels Between Labeled Graphs , 2003, ICML.

[15]  Kaspar Riesen,et al.  Towards the unification of structural and statistical pattern recognition , 2012, Pattern Recognit. Lett..

[16]  Horst Bunke,et al.  A graph distance metric based on the maximal common subgraph , 1998, Pattern Recognit. Lett..

[17]  Charles E. McCulloch,et al.  The EM Algorithm and Its Extensions , 1998 .

[18]  David Haussler,et al.  Convolution kernels on discrete structures , 1999 .

[19]  Bernhard Schölkopf,et al.  Nonlinear Component Analysis as a Kernel Eigenvalue Problem , 1998, Neural Computation.

[20]  Kaspar Riesen,et al.  Approximate graph edit distance computation by means of bipartite graph matching , 2009, Image Vis. Comput..

[21]  Sameer A. Nene,et al.  Columbia Object Image Library (COIL100) , 1996 .

[22]  Thomas Gärtner,et al.  Kernels for structured data , 2008, Series in Machine Perception and Artificial Intelligence.

[23]  C. Watkins Dynamic Alignment Kernels , 1999 .

[24]  Ernest Valveny,et al.  Vocabulary Selection for Graph of Words Embedding , 2011, IbPRIA.

[25]  Hans-Peter Kriegel,et al.  Protein function prediction via graph kernels , 2005, ISMB.

[26]  Anthony Widjaja,et al.  Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond , 2003, IEEE Transactions on Neural Networks.

[27]  David G. Stork,et al.  Pattern classification, 2nd Edition , 2000 .

[28]  Cordelia Schmid,et al.  Local Features and Kernels for Classification of Texture and Object Categories: A Comprehensive Study , 2006, 2006 Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'06).

[29]  Robert P. W. Duin,et al.  The Dissimilarity Representation for Pattern Recognition - Foundations and Applications , 2005, Series in Machine Perception and Artificial Intelligence.

[30]  Gabriel Valiente,et al.  A graph distance metric combining maximum common subgraph and minimum common supergraph , 2001, Pattern Recognit. Lett..

[31]  John D. Lafferty,et al.  Diffusion Kernels on Graphs and Other Discrete Input Spaces , 2002, ICML.

[32]  Edwin R. Hancock,et al.  Pattern Vectors from Algebraic Graph Theory , 2005, IEEE Trans. Pattern Anal. Mach. Intell..

[33]  Edwin R. Hancock,et al.  Spectral embedding of graphs , 2003, Pattern Recognit..

[34]  Mario Vento,et al.  Thirty Years Of Graph Matching In Pattern Recognition , 2004, Int. J. Pattern Recognit. Artif. Intell..

[35]  Kaspar Riesen,et al.  IAM Graph Database Repository for Graph Based Pattern Recognition and Machine Learning , 2008, SSPR/SPR.

[36]  Edwin R. Hancock,et al.  Graph Characterization via Ihara Coefficients , 2011, IEEE Transactions on Neural Networks.

[37]  Ernest Valveny,et al.  Report on the Second Symbol Recognition Contest , 2005, GREC.

[38]  Horst Bunke,et al.  Inexact graph matching for structural pattern recognition , 1983, Pattern Recognit. Lett..

[39]  Cor J. Veenman,et al.  Kernel Codebooks for Scene Categorization , 2008, ECCV.

[40]  Luc De Raedt,et al.  Feature Construction with Version Spaces for Biochemical Applications , 2001, ICML.

[41]  Robert P. W. Duin,et al.  Prototype selection for dissimilarity-based classifiers , 2006, Pattern Recognit..

[42]  S. V. N. Vishwanathan,et al.  Graph kernels , 2007 .

[43]  John D. Lafferty,et al.  Information Diffusion Kernels , 2002, NIPS.

[44]  Takashi Washio,et al.  An Apriori-Based Algorithm for Mining Frequent Substructures from Graph Data , 2000, PKDD.