Unsupervised Dependency Parsing without Gold Part-of-Speech Tags

We show that categories induced by unsupervised word clustering can surpass the performance of gold part-of-speech tags in dependency grammar induction. Unlike classic clustering algorithms, our method allows a word to have different tags in different contexts. In an ablative analysis, we first demonstrate that this context-dependence is crucial to the superior performance of gold tags: requiring a word to always have the same part-of-speech significantly degrades the performance of manual tags in grammar induction, eliminating the advantage that human annotation has over unsupervised tags. We then introduce a sequence modeling technique that combines the output of a word clustering algorithm with context-colored noise, to allow words to be tagged differently in different contexts. With these new induced tags as input, our state-of-the-art dependency grammar inducer achieves 59.1% directed accuracy on Section 23 (all sentences) of the Wall Street Journal (WSJ) corpus, 0.7% higher than using gold tags.
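Two ideas in the abstract can be made concrete with a small sketch: the ablation that forces each word to always carry the same tag, and the directed accuracy metric. The code below is an illustration only, not the authors' implementation. It assumes that "one tag per word" means assigning each word type its most frequent tag in the corpus, and that directed accuracy is the fraction of tokens whose predicted head index matches the gold head index. The function names and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def collapse_to_one_tag_per_word(tagged_sentences):
    """Give every token the most frequent tag of its word type.

    Assumption: this is one plausible reading of "requiring a word to
    always have the same part-of-speech"; the abstract does not spell
    out the exact procedure.
    """
    tag_counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            tag_counts[word][tag] += 1
    # Pick the single most frequent tag for each word type.
    best_tag = {word: counts.most_common(1)[0][0]
                for word, counts in tag_counts.items()}
    return [[(word, best_tag[word]) for word, _ in sentence]
            for sentence in tagged_sentences]


def directed_accuracy(gold_heads, predicted_heads):
    """Fraction of tokens whose predicted head equals the gold head.

    Each sentence is a list of head indices, one per token, with 0
    standing for the root (a common convention, assumed here).
    """
    total = 0
    correct = 0
    for gold, predicted in zip(gold_heads, predicted_heads):
        if len(gold) != len(predicted):
            raise ValueError("sentence lengths differ")
        total += len(gold)
        correct += sum(g == p for g, p in zip(gold, predicted))
    return correct / total if total else 0.0


# Toy corpus (hypothetical): "run" is a noun once and a verb twice.
corpus = [
    [("the", "DT"), ("run", "NN")],
    [("they", "PRP"), ("run", "VBP")],
    [("we", "PRP"), ("run", "VBP")],
]
collapsed = collapse_to_one_tag_per_word(corpus)
print(collapsed[0])  # [('the', 'DT'), ('run', 'VBP')]

# One three-token sentence; two of the three heads are predicted correctly.
print(directed_accuracy([[2, 0, 2]], [[2, 0, 1]]))
```

In the toy corpus the noun reading of "run" is lost after collapsing, which is the kind of context-dependent information the ablation removes from gold tags.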

Valentin I. Spitkovsky | Angel X. Chang | Daniel Jurafsky | Hiyan Alshawi
