Automatic Discovery of Fuzzy Synsets from Dictionary Definitions

To deal with ambiguity in natural language, words are commonly organised by sense into synsets: groups of synonymous words that can be seen as concepts. Because creating a broad-coverage synset base by hand is time-consuming, we extract synonymy pairs from dictionary definitions and cluster them to identify synsets. Since word senses are not discrete, we create fuzzy synsets, in which each word has a membership degree. We report the results of building a fuzzy synset base for Portuguese from three electronic dictionaries. The resulting resource is larger than existing handcrafted Portuguese thesauri.
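
As a rough illustration of the idea of fuzzy synsets, the following minimal sketch builds a graph from synonymy pairs and assigns each word a membership degree in the cluster centred on each of its neighbours. It is not the method used in this work: the cosine-over-neighbourhoods similarity, the `threshold` parameter, the function names, and the toy English pairs are all assumptions made for illustration only.

```python
from collections import defaultdict
from math import sqrt


def build_graph(pairs):
    """Build an undirected synonymy graph; each word is also linked to itself."""
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)
    for word in list(graph):
        graph[word].add(word)  # self link, so a word fully belongs to its own cluster
    return dict(graph)


def cosine(s, t):
    """Cosine similarity between two neighbourhoods seen as binary vectors."""
    return len(s & t) / sqrt(len(s) * len(t))


def fuzzy_synsets(pairs, threshold=0.5):
    """Return {centre: {word: membership}} for every word in the graph.

    Assumption: membership is the cosine similarity between the
    neighbourhood of the centre word and that of the candidate word.
    """
    graph = build_graph(pairs)
    synsets = {}
    for centre, neighbours in graph.items():
        members = {}
        for word in neighbours:
            mu = cosine(graph[centre], graph[word])
            if mu >= threshold:
                members[word] = round(mu, 3)
        synsets[centre] = members
    return synsets


# Invented toy pairs, standing in for pairs extracted from definitions.
toy_pairs = [
    ("car", "automobile"),
    ("car", "auto"),
    ("automobile", "auto"),
    ("bank", "shore"),
    ("bank", "depository"),
]

for centre, members in fuzzy_synsets(toy_pairs).items():
    print(centre, members)
```

In this toy data, "bank" is linked to two unrelated words, so its membership in the cluster centred on "shore" falls below 1, while the three mutually synonymous words for "car" belong fully to each other's clusters.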
