A Quantitative Assessment of SENSATIONAL with an Exploration of Its Applications

Word sense disambiguation (WSD) is the problem of selecting the appropriate sense of a word from a set of predefined possibilities. This is a significant problem in the biomedical domain, where a single word may denote a gene, a protein, or an abbreviation. In this paper, we evaluate SENSATIONAL, a novel unsupervised WSD technique, against two popular learning algorithms: support vector machines (SVM) and k-means. Measured by accuracy, SENSATIONAL outperforms SVM and k-means by 2% and 17%, respectively. In addition, we develop a polysemy-based search engine and an experimental visualization application that use the SENSATIONAL clustering technique.
