Paper information - On Robustness and Domain Adaptation using SVD for Word Sense Disambiguation

In this paper we explore robustness and domain adaptation issues for Word Sense Disambiguation (WSD) using Singular Value Decomposition (SVD) and unlabeled data. We focus on the semi-supervised domain adaptation scenario, where we train on the source corpus and test on the target corpus, and try to improve results using unlabeled data. Our method yields up to 16.3% error reduction compared to state-of-the-art systems, being the first to report successful semi-supervised domain adaptation. Surprisingly, the improvement comes from the use of unlabeled data from the source corpus, and not from the target corpus, meaning that we get robustness rather than domain adaptation. In addition, we study the behavior of our system on the target domain.
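The general idea in the abstract (learn a reduced feature space with SVD from labeled and unlabeled examples, train on the source corpus, test on the target corpus) can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration, not the authors' setup: the toy binary feature matrices, the rank of 2, the cosine k-NN classifier, and the helper names `fit_svd_projection` and `knn_predict` are all hypothetical.

```python
import numpy as np

def fit_svd_projection(feature_matrix, rank):
    """Return an (n_features, rank) projection learned from all rows given."""
    # Thin SVD: feature_matrix = U @ diag(s) @ Vt; keep the top right-singular vectors.
    _, _, vt = np.linalg.svd(feature_matrix, full_matrices=False)
    return vt[:rank].T

def knn_predict(train_x, train_y, test_x, k=1):
    """Cosine-similarity k-NN with a majority vote (assumed classifier)."""
    def unit(m):
        norms = np.linalg.norm(m, axis=1, keepdims=True)
        return m / np.where(norms == 0, 1.0, norms)
    sims = unit(test_x) @ unit(train_x).T
    nearest = np.argsort(-sims, axis=1)[:, :k]
    preds = []
    for row in nearest:
        labels, counts = np.unique(train_y[row], return_counts=True)
        preds.append(labels[np.argmax(counts)])
    return np.array(preds)

# Toy data (hypothetical): features 0-2 co-occur with sense 0, features 3-5 with sense 1.
source_x = np.array([[1, 1, 0, 0, 0, 0],
                     [1, 1, 0, 0, 0, 0],
                     [0, 0, 0, 1, 1, 0],
                     [0, 0, 0, 1, 1, 0]], dtype=float)
source_y = np.array([0, 0, 1, 1])

# Unlabeled source examples: features 2 and 5 appear only here, never in labeled data.
unlabeled_x = np.array([[0, 1, 1, 0, 0, 0],
                        [1, 0, 1, 0, 0, 0],
                        [0, 0, 0, 0, 1, 1],
                        [0, 0, 0, 1, 0, 1]], dtype=float)

# Target examples share no feature with the labeled source examples.
target_x = np.array([[0, 0, 1, 0, 0, 0],
                     [0, 0, 0, 0, 0, 1]], dtype=float)

# Learn the reduced space from labeled and unlabeled rows together.
projection = fit_svd_projection(np.vstack([source_x, unlabeled_x]), rank=2)

# Train on the projected source examples, predict on the projected target examples.
predictions = knn_predict(source_x @ projection, source_y, target_x @ projection, k=1)
print(predictions)
```

In the raw feature space the target rows have zero overlap with the labeled source rows, so a similarity-based classifier has nothing to go on. In the reduced space, co-occurrences observed in the unlabeled rows place feature 2 near features 0 and 1 (and feature 5 near 3 and 4), which is the kind of effect the abstract attributes to unlabeled data.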

Eneko Agirre | Oier Lopez de Lacalle