Applying a Naive Bayes Similarity Measure to Word Sense Disambiguation

We replace the overlap mechanism of the Lesk algorithm with a simple, general-purpose Naive Bayes model that measures many-to-many association between two sets of random variables. Even with simple probability estimates such as maximum likelihood, the model significantly outperforms the Lesk algorithm on word sense disambiguation tasks. With additional lexical knowledge from WordNet, performance improves further and surpasses state-of-the-art results.
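To make the idea concrete, the following is a minimal sketch of how a Naive Bayes association score could stand in for Lesk's word-overlap count. It is an illustration under stated assumptions, not the model as formulated in the paper: the function names, the toy corpus, the sentence-level co-occurrence counts, the add-alpha smoothing (used here only to avoid taking the log of zero, where the abstract mentions plain maximum likelihood), and the averaging over word pairs are all hypothetical choices.

```python
import math
from collections import Counter
from itertools import combinations

def train_counts(sentences):
    """Count, per sentence, which words occur and which pairs co-occur."""
    unigram, cooc = Counter(), Counter()
    for sent in sentences:
        words = set(sent)
        unigram.update(words)
        for a, b in combinations(sorted(words), 2):
            cooc[(a, b)] += 1
            cooc[(b, a)] += 1
    return unigram, cooc

def lesk_overlap(context, gloss):
    """Baseline: number of word types shared by context and gloss."""
    return len(set(context) & set(gloss))

def nb_score(context, gloss, unigram, cooc, alpha=1.0):
    """Average log p(c | g) over all context-gloss word pairs.

    Assumes every pair is conditionally independent (the 'naive' step).
    Smoothing and averaging are assumptions of this sketch.
    """
    if not context or not gloss:
        return float("-inf")
    vocab_size = len(unigram)
    total = 0.0
    for g in gloss:
        for c in context:
            p = (cooc[(c, g)] + alpha) / (unigram[g] + alpha * vocab_size)
            total += math.log(p)
    return total / (len(context) * len(gloss))

def disambiguate(context, sense_glosses, unigram, cooc):
    """Pick the sense whose gloss is most strongly associated with the context."""
    return max(
        sense_glosses,
        key=lambda s: nb_score(context, sense_glosses[s], unigram, cooc),
    )

# Toy data, invented purely for illustration.
corpus = [
    ["bank", "money", "deposit", "loan"],
    ["money", "loan", "interest"],
    ["river", "water", "shore"],
    ["bank", "river", "water"],
    ["deposit", "cash", "money"],
]
glosses = {
    "bank#finance": ["money", "loan"],
    "bank#river": ["water", "shore"],
}
context = ["cash", "interest"]

unigram, cooc = train_counts(corpus)
best = disambiguate(context, glosses, unigram, cooc)
```

In this toy setup the context shares no words with either gloss, so the overlap count is zero for both senses and cannot separate them, whereas the association score can still prefer one sense because the context words co-occur with that sense's gloss words in the corpus.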
