Toward a truly multilingual GlobalWordnet Grid

In this paper, we describe a new and improved GlobalWordnet Grid that takes advantage of the Collaborative InterLingual Index (CILI). Currently, the Open Multilingual Wordnet makes many wordnets accessible as a single linked wordnet, but because it uses the Princeton WordNet of English (PWN) as a pivot, it loses concepts that are not part of PWN. The technical solution to this, a central registry of concepts, as proposed in the EuroWordnet project through the InterLingual Index, has been known for many years. However, the practical issues of how to host this index and who decides what goes into it remained unsolved. Inspired by current practice in the Semantic Web and the Linked Open Data community, we propose a way to solve these issues. We define the principles and protocols for contributing to the Grid and, to validate the current setup, test them on two use cases: adding version 3.1 of the Princeton WordNet to a CILI based on version 3.0, and adding the Open Dutch Wordnet. This paper aims to be a call for action that we hope will be further discussed and ultimately taken up by the whole wordnet community.
