Applying Linked Data Principles to Linking Multilingual Wordnets

Wordnets are the most widely used lexical resources in natural language processing (NLP). Wordnets now exist in more than 40 languages, and all of them are connected to the original Princeton WordNet. The origins of linguistic linked data (LD) can thus, in some sense, be traced to the WordNet project. The linking, however, has not been implemented with stable identifiers, which causes technical problems of reference whenever a new version of a wordnet is released. This chapter describes how linked data principles have been applied in developing the Global WordNet Grid (GWG), an attempt to form a catalogue of interlingual contexts that extends beyond the Anglo-Saxon roots of the Princeton WordNet. In particular, we describe how LD technologies have been used to realize a Collaborative Interlingual Index (CILI) that builds on standard LD vocabularies and the Resource Description Framework (RDF) data model. Finally, we describe a method for linking wordnets to external resources such as DBpedia/Wikipedia.
