Paper Information - SVD and Clustering for Unsupervised POS Tagging

SVD and Clustering for Unsupervised POS Tagging

We revisit the algorithm of Schütze (1995) for unsupervised part-of-speech tagging. The algorithm uses reduced-rank singular value decomposition followed by clustering to extract latent features from context distributions. As implemented here, it achieves state-of-the-art tagging accuracy at considerably less cost than more recent methods. It can also produce a range of finer-grained taggings, with potential applications to various tasks.
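The pipeline named in the abstract (context distributions, then reduced-rank SVD, then clustering) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy corpus, the helper names (`context_matrix`, `reduced_rank_features`, `kmeans`), the use of immediate left and right neighbours as context, the rank of 3, and the choice of plain k-means as the clustering step are all assumptions made for illustration.

```python
import numpy as np

def context_matrix(tokens, vocab):
    """Count immediate left and right neighbours of each word type.

    Returns an array of shape (len(vocab), 2 * len(vocab)): the left-context
    counts and the right-context counts, concatenated side by side.
    """
    idx = {w: i for i, w in enumerate(vocab)}
    n = len(vocab)
    left = np.zeros((n, n))
    right = np.zeros((n, n))
    for prev, cur in zip(tokens, tokens[1:]):
        right[idx[prev], idx[cur]] += 1  # cur is the right neighbour of prev
        left[idx[cur], idx[prev]] += 1   # prev is the left neighbour of cur
    return np.hstack([left, right])

def reduced_rank_features(counts, rank):
    """Normalise rows, then keep the top `rank` SVD dimensions."""
    row_sums = counts.sum(axis=1, keepdims=True)
    dist = counts / np.maximum(row_sums, 1)  # guard against empty rows
    u, s, _ = np.linalg.svd(dist, full_matrices=False)
    return u[:, :rank] * s[:rank]            # latent descriptor per word type

def kmeans(x, k, iters=50, seed=0):
    """Plain k-means (an assumed stand-in for the clustering step)."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):          # leave empty clusters in place
                centers[j] = x[labels == j].mean(axis=0)
    return labels

# Hypothetical toy corpus; a real experiment would use a large tagged corpus.
tokens = "the dog runs . a cat sleeps . the cat runs . a dog sleeps .".split()
vocab = sorted(set(tokens))

counts = context_matrix(tokens, vocab)
features = reduced_rank_features(counts, rank=3)
labels = kmeans(features, k=3)

for word, label in zip(vocab, labels):
    print(word, int(label))
```

Each word type receives one cluster label, which plays the role of an induced tag. Raising `k` would give a finer-grained tagging, in the spirit of the abstract's last sentence, although the quality of the clusters on a corpus this small is not meaningful.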

Mark Johnson | Elie Bienenstock | Michael Lamar | Yariv Maron
