Paper information - Supervised Domain Adaption for WSD

Supervised Domain Adaption for WSD

The lack of positive results on supervised domain adaptation for WSD has cast some doubt on the utility of hand-tagging general corpora and thus of developing generic supervised WSD systems. In this paper we show for the first time that our WSD system, trained on a general source corpus (BNC) together with the target corpus, obtains up to 22% error reduction compared to a system trained on the target corpus alone. In addition, we show that as little as 40% of the target corpus (when supplemented with the source corpus) is sufficient to obtain the same results as training on the full target data. The key to success is the use of unlabeled data with SVD, a combination of kernels and SVM.
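As a reading aid for the "22% error reduction" figure, the sketch below assumes the figure is a relative reduction in error rate, which is the usual convention but is not spelled out in the abstract. The function name and the two accuracy values are hypothetical and chosen only to make the arithmetic concrete; they are not results from the paper.

```python
def relative_error_reduction(baseline_accuracy, adapted_accuracy):
    """Fraction of the baseline error removed by the adapted system."""
    baseline_error = 1.0 - baseline_accuracy
    adapted_error = 1.0 - adapted_accuracy
    if baseline_error <= 0:
        raise ValueError("baseline has no error to reduce")
    return (baseline_error - adapted_error) / baseline_error


# Hypothetical accuracies (NOT taken from the paper):
# target-only training vs. source + target training.
reduction = relative_error_reduction(0.80, 0.844)
print(f"relative error reduction: {reduction:.1%}")
```

Under this reading, a system that moves from 20% error to 15.6% error has removed about 22% of the baseline error, even though accuracy rose by only 4.4 points.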

Eneko Agirre | Oier Lopez de Lacalle
