WN-Toolkit: Automatic generation of WordNets following the expand model

This paper presents a set of methodologies and algorithms for creating WordNets following the expand model. We explore dictionary-based and BabelNet-based strategies, as well as methodologies based on parallel corpora. Evaluation results are presented for six languages: Catalan, Spanish, French, German, Italian and Portuguese. Along with the methodologies and evaluation, we present an implementation of all the algorithms, grouped into a set of programs, or toolkit. These programs have been successfully used in the Know2 Project to create the Catalan and Spanish WordNet 3.0. The toolkit is published under the GNU-GPL license and can be freely downloaded from http://lpg.uoc.edu/wn-toolkit.
