A Bayesian Model for Generative Transition-based Dependency Parsing

We propose a simple, scalable, fully generative model for transition-based dependency parsing that achieves high accuracy. The model, parameterized by hierarchical Pitman-Yor processes, overcomes the limitations of previous generative models by allowing fast and accurate inference. We introduce an efficient decoding algorithm based on particle filtering that adapts the beam size to the model's uncertainty while jointly predicting POS tags and parse trees. The parser's unlabelled attachment score (UAS) is on par with that of a greedy discriminative baseline. As a language model, it obtains lower perplexity than an n-gram model by performing semi-supervised learning over a large unlabelled corpus. We show that the model generates locally and syntactically coherent sentences, opening the door to further applications in language generation.
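For reference, the model's distributions are hierarchical Pitman-Yor processes; the abstract does not give the exact conditioning contexts, but the predictive probability of such a process takes the standard form used by Teh (2006) for n-gram language modelling, backing off from a context u to a shortened context π(u):

```latex
P(w \mid u) \;=\; \frac{c_{uw} - d\, t_{uw}}{\theta + c_{u}}
\;+\; \frac{\theta + d\, t_{u}}{\theta + c_{u}}\, P(w \mid \pi(u))
```

where c_{uw} is the count of w observed after context u, t_{uw} is the corresponding table count in the Chinese-restaurant representation, c_u and t_u are their sums over w, d is the discount, and θ the strength parameter.

The following is a minimal sketch of particle-filter-style decoding with an adaptive beam, not the paper's algorithm: the interface `step_distribution` and the idea of representing particle weights as integer counts are illustrative assumptions, and the sketch omits the importance-weight correction that conditioning on the observed words would require. It only shows how resampling a fixed particle budget over transition histories lets the effective beam widen where the model is uncertain and collapse where it is confident.

```python
import random

def particle_filter_decode(sentence, step_distribution, num_particles=64, seed=0):
    """Sketch of particle-filter decoding for a generative transition-based parser.

    step_distribution(history, sentence) is assumed to return a dict mapping each
    valid next transition (e.g. 'sh', 'la', 'ra') to its probability under the
    model, or an empty dict in a terminal configuration.
    """
    rng = random.Random(seed)
    # Each key is a transition history; its integer count is the particle budget
    # currently allocated to that hypothesis (counts play the role of weights).
    particles = {(): num_particles}
    for _ in range(2 * len(sentence)):  # an arc-standard derivation has 2n transitions
        next_particles = {}
        for history, count in particles.items():
            dist = step_distribution(history, sentence)
            if not dist:  # terminal configuration: carry the particle forward unchanged
                next_particles[history] = next_particles.get(history, 0) + count
                continue
            # Multinomial resampling: split this hypothesis's budget among successor
            # transitions in proportion to the model's probabilities.
            actions, probs = zip(*dist.items())
            for action in rng.choices(actions, weights=probs, k=count):
                new_hist = history + (action,)
                next_particles[new_hist] = next_particles.get(new_hist, 0) + 1
        particles = next_particles
    # Return the most populated derivation as the 1-best parse.
    return max(particles, key=particles.get)
```

Because identical histories are merged, the number of distinct hypotheses tracked at each step grows only where the transition distribution is flat, which is the adaptive-beam behaviour the abstract describes.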
