(Hyper)graph Kernels over Simplicial Complexes

Graph kernels are one of the mainstream approaches for measuring similarity between graphs, especially in pattern recognition and machine learning tasks. Graphs, in turn, have gained considerable attention thanks to their ability to model real-world phenomena in fields ranging from bioinformatics to social network analysis. Recently, however, attention has shifted towards hypergraphs, a generalization of plain graphs in which multi-way relations (beyond pairwise ones) can be considered. In this paper, four (hyper)graph kernels are proposed, and their efficiency and effectiveness are compared in a twofold fashion: first, by inferring simplicial complexes on top of the underlying graphs and comparing the kernels against state-of-the-art approaches on 18 benchmark datasets; second, by addressing a real-world case study (i.e., metabolic pathways classification) in which the input data are natively represented by hypergraphs. With this work, we aim to foster the extension of graph kernels towards hypergraphs and, more generally, to bridge the gap between structural pattern recognition and the domain of hypergraphs.
