Multilingual Ontology Matching based on Wiktionary Data Accessible via SPARQL Endpoint

Interoperability is a key requirement of the Semantic Web, and ontology matching methods and algorithms provide it. Ontologies are now published not only in English but in other languages as well, so multilingual ontology matching depends on automatic translation to obtain correct matching pairs. Translation into many languages can be based on resources such as the Google Translate API or the Wiktionary database. Considering its balanced coverage of many languages, its manually crafted translations, and the large size of its dictionary, Wiktionary is the most promising resource. It is a collaborative project that works on the same principles as Wikipedia. A Wiktionary parser was developed and a machine-readable dictionary was designed. The data of the machine-readable Wiktionary are stored in a relational database, and a D2R server exposes this database as an RDF store. Lexicographic information (definitions, translations, synonyms) can therefore be retrieved from a web service using SPARQL queries. The case study addresses multilingual ontology matching based on Wiktionary data accessible via a SPARQL endpoint. Ontology matching results obtained using Wiktionary were compared with results based on the Google Translate API.
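
To make the SPARQL access path concrete, the following is a minimal sketch of how a client might ask such an endpoint for translations of a word. The endpoint URL, the `wikt:` vocabulary, and all property names (`page_title`, `lang_code`, `translation`, `text`) are hypothetical placeholders, since the abstract does not give the actual D2R mapping; only the SPARQL Protocol request shape and the SPARQL JSON results layout are standard.

```python
import json
import urllib.parse
import urllib.request

# ASSUMPTION: hypothetical endpoint and vocabulary; the real D2R mapping
# of the machine-readable Wiktionary is not given in the text.
ENDPOINT = "http://localhost:2020/sparql"
VOCAB = "http://localhost:2020/vocab/"


def escape_literal(text):
    """Escape backslashes and double quotes for a SPARQL string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def build_translation_query(word, source_lang, target_lang):
    """Build a SELECT query for translations of `word`.

    All property names below are illustrative, not the real schema.
    """
    return (
        f"PREFIX wikt: <{VOCAB}>\n"
        "SELECT DISTINCT ?translation WHERE {\n"
        f'  ?entry wikt:page_title "{escape_literal(word)}" ;\n'
        f'         wikt:lang_code "{escape_literal(source_lang)}" ;\n'
        "         wikt:translation ?t .\n"
        f'  ?t wikt:lang_code "{escape_literal(target_lang)}" ;\n'
        "     wikt:text ?translation .\n"
        "}"
    )


def parse_results(payload):
    """Extract ?translation values from a SPARQL JSON results document."""
    bindings = payload.get("results", {}).get("bindings", [])
    return [b["translation"]["value"] for b in bindings if "translation" in b]


def fetch_translations(word, source_lang, target_lang, endpoint=ENDPOINT):
    """Send the query with HTTP GET, as defined by the SPARQL Protocol."""
    query = build_translation_query(word, source_lang, target_lang)
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    request = urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )
    with urllib.request.urlopen(request) as response:
        return parse_results(json.load(response))
```

In a matching pipeline, the labels of one ontology would be translated this way and then compared with the labels of the other ontology; `fetch_translations("book", "en", "ru")` only returns data once `ENDPOINT` and the property names are replaced with those of a real deployment.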
