A Hierarchical Pitman-Yor Process HMM for Unsupervised Part of Speech Induction

In this work we address the problem of unsupervised part-of-speech induction by bringing together several strands of research into a single model. We develop a novel hidden Markov model incorporating sophisticated smoothing using a hierarchical Pitman-Yor process prior, providing an elegant and principled means of incorporating lexical characteristics. Central to our approach is a new type-based sampling algorithm for hierarchical Pitman-Yor models in which we track fractional table counts. In an empirical evaluation we show that our model consistently outperforms the current state-of-the-art across 10 languages.
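To make the smoothing idea concrete, the following is a minimal sketch of the standard predictive probability for a single Pitman-Yor "restaurant" that backs off to a parent distribution. It is not the paper's sampler or model code. The function name `pyp_predictive`, its parameters, and the toy counts are assumptions made for illustration, and allowing the table counts to be fractional is only a nod to the fractional table counts mentioned in the abstract.

```python
def pyp_predictive(word, counts, tables, discount, strength, base_prob):
    """Predictive probability of `word` under one Pitman-Yor restaurant.

    counts[w]  : number of customers (tokens) observed for w
    tables[w]  : number of tables serving w (assumed here to allow fractions)
    discount   : PYP discount parameter, 0 <= discount < 1
    strength   : PYP strength (concentration) parameter, > -discount
    base_prob  : probability of `word` under the parent (backoff) distribution
    """
    total_customers = sum(counts.values())
    total_tables = sum(tables.values())
    if total_customers == 0:
        # An empty restaurant defers entirely to the parent distribution.
        return base_prob

    c_w = counts.get(word, 0)
    t_w = tables.get(word, 0.0)
    denom = strength + total_customers

    # Mass from joining an existing table that already serves `word`.
    seated = max(c_w - discount * t_w, 0.0) / denom
    # Mass from opening a new table, whose dish is drawn from the parent.
    new_table = (strength + discount * total_tables) / denom
    return seated + new_table * base_prob


# Toy example (hypothetical data): four-word vocabulary, uniform parent.
vocab = ["the", "dog", "cat", "runs"]
counts = {"the": 3, "dog": 1}
tables = {"the": 1.0, "dog": 1.0}
probs = {w: pyp_predictive(w, counts, tables, 0.5, 1.0, 1.0 / len(vocab))
         for w in vocab}
```

In this sketch, seen words keep most of their mass while the discount shifts a share to the parent distribution, so the unseen words "cat" and "runs" still receive nonzero probability and the four probabilities sum to one.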

Phil Blunsom | Trevor Cohn
