WoNeF, an improved, extended and evaluated automatic French translation of WordNet

Identifying the various possible meanings of each word of the vocabulary is a difficult problem that requires a lot of manual work. The WordNet lexical semantics database has tackled it for English, but few resources are available for other languages. Automatic translations of WordNet have been attempted for many target languages, including French. JAWS is one such automatic translation of WordNet nouns into French, built using bilingual dictionaries and a syntactic language model. We improve the precision and coverage of the existing translation, complete it with translations of verbs and adjectives, and enhance its evaluation method, demonstrating the validity of the approach. In addition to the main result, called WoNeF, we produce two additional versions: a high-precision version with 93% precision (up to 97% on nouns) and a high-coverage version with 109,447 (literal, synset) pairs.
