Link Representation and Discovery

In this chapter we address the question of how links can be discovered between different datasets published as Linguistic Linked Open Data. We describe common patterns for representing links both between data in the same language (monolingual scenario) and between data in different languages (cross-lingual scenario). We then describe techniques for discovering links between datasets automatically. Because most of these techniques rely on computing similarities between data elements, we briefly review the most common techniques for computing syntactic and semantic similarity. Finally, we give a brief overview of tools and frameworks that can be used to discover links between language resources semi-automatically.
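
To make the idea of similarity-based link discovery concrete, the following minimal Python sketch compares the labels of two small datasets using a normalised edit distance and proposes a link whenever the score reaches a threshold. It is an illustration only, not the method of any particular framework: the function names, the example URIs, and the threshold of 0.8 are assumptions introduced here.

```python
from typing import Dict, List, Tuple


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between two strings."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Case-insensitive syntactic similarity in [0, 1]; 1.0 means identical."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def discover_links(source: Dict[str, str],
                   target: Dict[str, str],
                   threshold: float = 0.8) -> List[Tuple[str, str, float]]:
    """Propose (source URI, target URI, score) for label pairs above the threshold.

    The inputs map hypothetical entity URIs to their labels. Every pair is
    compared, so this naive version is quadratic in the dataset sizes.
    """
    links = []
    for s_uri, s_label in source.items():
        for t_uri, t_label in target.items():
            score = similarity(s_label, t_label)
            if score >= threshold:
                links.append((s_uri, t_uri, score))
    return links


if __name__ == "__main__":
    # Hypothetical example data, not taken from any real dataset.
    source = {"http://example.org/a/colour": "colour"}
    target = {"http://example.org/b/color": "color",
              "http://example.org/b/shape": "shape"}
    for s_uri, t_uri, score in discover_links(source, target):
        print(s_uri, t_uri, round(score, 3))
```

A purely syntactic measure of this kind is only suitable for the monolingual scenario. In the cross-lingual scenario the labels are in different languages, so the comparison would have to rely on translation or on a semantic similarity measure instead.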
