Density Maximization in Context-Sense Metric Space for All-words WSD

This paper proposes a novel smoothing model with a combinatorial optimization scheme for all-words word sense disambiguation (WSD) from untagged corpora. By generalizing discrete senses to a continuum, we introduce smoothing in context-sense space to cope with the data sparsity that results from the large variety of linguistic contexts and senses, and to exploit sense interdependency among the words in the same text string. Through this smoothing, all the optimal senses are obtained at once under the maximum marginal likelihood criterion: competitive probabilistic kernels reinforce one another among nearby words and suppress conflicting sense hypotheses within the same word. Experimental results confirm the superiority of the proposed method over conventional ones: it outperforms the most-frequent-sense baseline, a level that none of the SemEval-2 unsupervised systems reached.
