Feature Words Selection for Knowledge-based Word Sense Disambiguation with Syntactic Parsing

Feature words are crucial clues for word sense disambiguation (WSD). Two strategies are commonly used to select them: window-based and dependency-based methods. Each has shortcomings: window-based selection admits irrelevant noise words, while dependency-based selection often yields too few feature words. To address these problems, this paper proposes two methods for selecting feature words with syntactic parsing, one based on the phrase structure parsing tree (PTree) and one based on the dependency parsing tree (DTree). With the help of syntactic parsing, the proposed methods select feature words more accurately, alleviating the noise introduced by the window-based method and avoiding the paucity of feature words of the dependency-based method. Evaluation is performed with a knowledge-based WSD system on a publicly available lexical sample dataset. The results show that both proposed methods outperform the window-based and dependency-based methods, and that the PTree-based method is better than the DTree-based one; both are therefore preferred strategies for selecting feature words when disambiguating ambiguous words. Summary: The paper proposes two methods for selecting feature words, based on phrase structure parsing and on dependency parsing. The experiments were carried out on different datasets. The proposed methods achieve higher accuracy than the previously used window-based and dependency-based methods. (Feature word selection for knowledge-based word sense disambiguation with syntactic parsing)
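The abstract only outlines the three selection strategies, so the following Python sketch contrasts them on a single hand-parsed example. It is not the authors' published algorithm: the example sentence, the stopword list, the window size k, the hand-built dependency heads, the bracketed constituency parse, and the clause labels used for the tree-based selection are all assumptions made for illustration; a real system would obtain the parses from a syntactic parser.

```python
# Illustrative sketch (assumptions only, not the paper's exact method):
# contrast window-based, dependency-based, and parse-tree-based selection
# of feature words for one ambiguous target word.

from nltk import Tree

TOKENS = "Investors worried because the bank raised the interest rate".split()
TARGET_IDX = 4                               # the ambiguous word "bank"
STOPWORDS = {"the", "a", "an", "of", "on"}   # tiny illustrative list


def window_features(tokens, target_idx, k=3):
    """Window-based selection: the k tokens to either side of the target.
    Simple, but it can pull in words that are syntactically unrelated."""
    left = tokens[max(0, target_idx - k):target_idx]
    right = tokens[target_idx + 1:target_idx + 1 + k]
    return [w for w in left + right if w.lower() not in STOPWORDS]


def dependency_features(heads, tokens, target_idx):
    """Dependency-based selection: the syntactic head of the target plus its
    direct dependents. Accurate, but it often yields very few feature words."""
    feats = []
    if heads[target_idx] is not None:
        feats.append(tokens[heads[target_idx]])
    feats += [tokens[i] for i, h in enumerate(heads) if h == target_idx]
    return [w for w in feats if w.lower() not in STOPWORDS]


def ptree_features(ptree, tokens, target_idx, clause_labels=("S", "SBAR")):
    """Tree-based selection in the spirit of a PTree method (details assumed):
    take the words inside the smallest clause-level subtree that covers the
    target, so the context is syntactically bounded."""
    best = ptree                             # fall back to the whole sentence
    for sub in ptree.subtrees():
        if (sub.label() in clause_labels
                and tokens[target_idx] in sub.leaves()
                and len(sub.leaves()) < len(best.leaves())):
            best = sub
    return [w for w in best.leaves()
            if w != tokens[target_idx] and w.lower() not in STOPWORDS]


# Hand-built head indices (None = root) and a bracketed constituency parse
# for the example sentence; both would normally come from a parser.
HEADS = [1, None, 5, 4, 5, 1, 8, 8, 5]
PTREE = Tree.fromstring(
    "(S (NP (NNS Investors))"
    " (VP (VBD worried)"
    "  (SBAR (IN because)"
    "   (S (NP (DT the) (NN bank))"
    "    (VP (VBD raised)"
    "     (NP (DT the) (NN interest) (NN rate)))))))")

print("window    :", window_features(TOKENS, TARGET_IDX))
print("dependency:", dependency_features(HEADS, TOKENS, TARGET_IDX))
print("ptree     :", ptree_features(PTREE, TOKENS, TARGET_IDX))
```

On this example the window-based selection (k = 3) returns "worried", "because", "raised", "interest": it crosses the clause boundary and misses "rate". The dependency-based selection returns only "raised", while the clause-bounded tree selection returns "raised", "interest", "rate". The sketch is only meant to show why a syntactically bounded context can be both richer than the immediate dependency neighbourhood and less noisy than a fixed window.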
