An overview of distance and similarity functions for structured data

The notions of distance and similarity play a key role in many machine learning approaches, and in artificial intelligence in general, since they can serve as an organizing principle by which individuals classify objects, form concepts, and make generalizations. While distance functions for propositional representations have been thoroughly studied, work on distance functions for structured representations, such as graphs, frames, or logical clauses, has been carried out in different communities and is much less understood. In particular, much of the work that requires a distance or similarity function for structured representations of data employs ad-hoc functions designed for specific applications. The goal of this paper is therefore to provide an overview of this work, identify connections between the work carried out in different areas, and point out directions for future work.
