Paper information - Zipfian corruptions for robust POS tagging

Zipfian corruptions for robust POS tagging

Inspired by robust generalization and adversarial learning, we describe a novel approach to learning structured perceptrons for part-of-speech (POS) tagging that is less sensitive to domain shifts. The objective of our method is to minimize average loss under random distribution shifts. We restrict the possible target distributions to mixtures of the source distribution and random Zipfian distributions. We apply the algorithm to POS tagging and evaluate it on the English Web Treebank and the Danish Dependency Treebank, obtaining an average error reduction of 4.4% in tagging.

Anders Søgaard
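
The abstract restricts target distributions to mixtures of the source distribution and random Zipfian distributions. The following is a minimal sketch of what such a mixture could look like over a toy vocabulary; it is not the paper's implementation. The function names, the mixing weight `lam`, the Zipf `exponent`, and the choice to assign ranks by random shuffling are all assumptions made for illustration.

```python
import random
from collections import Counter


def zipfian_probs(n, exponent=1.0):
    """Normalised Zipfian probabilities for ranks 1..n: p(r) proportional to 1 / r**exponent."""
    raw = [1.0 / rank ** exponent for rank in range(1, n + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def random_zipfian(vocab, rng, exponent=1.0):
    """Zipfian distribution over vocab, with ranks assigned in random order (assumption)."""
    items = list(vocab)
    rng.shuffle(items)
    return dict(zip(items, zipfian_probs(len(items), exponent)))


def source_distribution(tokens):
    """Empirical relative frequencies of the observed tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def mix(source, zipf, lam):
    """Convex mixture (1 - lam) * source + lam * zipf; lam is a hypothetical parameter."""
    return {w: (1.0 - lam) * source[w] + lam * zipf[w] for w in source}


rng = random.Random(0)
tokens = "the cat sat on the mat and the dog sat too".split()
source = source_distribution(tokens)
target = mix(source, random_zipfian(sorted(source), rng), lam=0.3)
```

Because both components are probability distributions over the same vocabulary, `target` is again a probability distribution. Drawing a fresh `random_zipfian` for each training pass would give one way to simulate random distribution shifts, though how the paper applies them to perceptron training is not specified in the abstract.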
