Paper information - Improving Word Sense Disambiguation with Linguistic Knowledge from a Sense Annotated Treebank

Improving Word Sense Disambiguation with Linguistic Knowledge from a Sense Annotated Treebank

In this paper we present an approach for enriching word sense disambiguation (WSD) knowledge bases with data-driven relations extracted from a gold standard corpus (annotated with word senses, valency information, syntactic analyses, etc.). We focus on Bulgarian as a use case, but the approach can be applied to other languages as well. To explore such methods, we used the Personalized PageRank algorithm. The reported results show that adding the new knowledge improves WSD accuracy by approximately 10.5%.
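As a rough illustration of why adding relations to the knowledge graph can change a Personalized PageRank ranking of senses, here is a minimal Python sketch. It is not the authors' implementation: the toy graph, the node names (such as `bank#finance`), the added relation, and the function `personalized_pagerank` are all hypothetical, and the graph is treated as undirected for simplicity.

```python
def personalized_pagerank(edges, seeds, damping=0.85, iterations=50):
    """Power iteration for Personalized PageRank on an undirected graph.

    `seeds` are the nodes that receive the teleport mass (here: context
    words). Every seed is assumed to appear in at least one edge.
    """
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    nodes = sorted(neighbors)

    # Teleport distribution: uniform over the seeds, zero elsewhere.
    teleport = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(teleport)

    for _ in range(iterations):
        new_rank = {}
        for n in nodes:
            # Each neighbor passes on an equal share of its current rank.
            incoming = sum(rank[m] / len(neighbors[m]) for m in neighbors[n])
            new_rank[n] = (1 - damping) * teleport[n] + damping * incoming
        rank = new_rank
    return rank


# Hypothetical toy knowledge base: context words linked to candidate senses,
# plus one lexical relation between senses.
base_edges = [
    ("bank", "bank#finance"),
    ("bank", "bank#river"),
    ("money", "money#currency"),
    ("bank#river", "water#liquid"),
]
# Hypothetical relation of the kind a sense-annotated corpus might supply.
added_edges = [("bank#finance", "money#currency")]

context = {"bank", "money"}
base = personalized_pagerank(base_edges, context)
enriched = personalized_pagerank(base_edges + added_edges, context)

for label, ranks in (("base", base), ("enriched", enriched)):
    best = max(("bank#finance", "bank#river"), key=ranks.get)
    print(label, "->", best)
```

In this toy setting the base graph favors `bank#river`, because that sense has an extra neighbor feeding rank back to it, while the enriched graph favors `bank#finance`, because the added relation connects it to the sense of the other context word. The numbers carry no empirical meaning; they only show the mechanism.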

Kiril Ivanov Simov | Petya Osenova | Alexander Popov
