Grouping Synonyms by Definitions

We present a method for grouping the synonyms of a lemma according to its dictionary senses. The senses are defined by a large machine-readable dictionary for French, the TLFi (Trésor de la Langue Française informatisé), and the synonyms are given by five synonym dictionaries (also for French). To evaluate the proposed method, we manually constructed a gold standard in which, for each (word, definition) pair and given the set of synonyms defined for that word by the five synonym dictionaries, four lexicographers specified the set of synonyms they judged adequate. While inter-annotator agreement on that task ranges from 67% to at best 88%, depending on the annotator pair and on the synonym dictionary considered, the automatic procedure we propose achieves a precision of 67% and a recall of 71%. The proposed method is compared with related work, namely word sense disambiguation, synonym lexicon acquisition, and WordNet construction.
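The evaluation setup described above can be illustrated with a minimal sketch. It assumes that precision and recall are micro-averaged over all (word, definition) pairs; the abstract does not state how the scores were aggregated, so this choice is an assumption. The function name `precision_recall`, the key layout, and the toy entries are hypothetical placeholders, not data from the study.

```python
from typing import Dict, Set, Tuple

# A key identifies one dictionary sense: (word, definition identifier).
Key = Tuple[str, str]


def precision_recall(
    predicted: Dict[Key, Set[str]], gold: Dict[Key, Set[str]]
) -> Tuple[float, float]:
    """Micro-averaged precision and recall over all (word, definition) pairs.

    Assumption: counts are pooled across pairs before dividing.
    """
    true_pos = n_pred = n_gold = 0
    for key in gold.keys() | predicted.keys():
        p = predicted.get(key, set())
        g = gold.get(key, set())
        true_pos += len(p & g)  # synonyms assigned to the sense and judged adequate
        n_pred += len(p)        # all synonyms the procedure assigned to the sense
        n_gold += len(g)        # all synonyms the annotators judged adequate
    precision = true_pos / n_pred if n_pred else 0.0
    recall = true_pos / n_gold if n_gold else 0.0
    return precision, recall


# Toy placeholder data (invented for illustration only).
gold = {("w", "def1"): {"a", "b", "c"}, ("w", "def2"): {"d"}}
predicted = {("w", "def1"): {"a", "b"}, ("w", "def2"): {"d", "c"}}

print(precision_recall(predicted, gold))
```

In this toy case the synonym "c" is attached to the wrong definition, which costs one point of precision (a wrong assignment under def2) and one point of recall (a missed synonym under def1).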
