Latent-Variable PCFGs: Background and Applications

Latent-variable probabilistic context-free grammars (L-PCFGs) are latent-variable models based on context-free grammars. Each nonterminal is associated with latent states that carry contextual information through the top-down rewriting process of the grammar. We survey a few of the techniques used to estimate such grammars and to parse text with them. We also describe what the latent states represent in English Penn Treebank parsing, and give an overview of extensions of these grammars and of related models.
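
As a minimal illustration of how latent states enter such a model, the Python sketch below computes the probability of a fixed skeletal tree by summing over all latent-state assignments with a bottom-up, inside-style recursion. Everything in it is an assumption made for illustration: the toy grammar, the table names `ROOT`, `BINARY` and `LEXICAL`, the number of latent states `M`, and all probability values. None of it is taken from a treebank or from a particular estimation method.

```python
# Toy latent-variable PCFG. All names and numbers are hypothetical.
M = 2  # number of latent states per nonterminal (assumed)

# ROOT[a][h]: probability that the tree is rooted in nonterminal a with state h.
ROOT = {"S": [0.6, 0.4]}

# BINARY[(a, b, c)][h1][h2][h3] = p(a -> b c, h2, h3 | a, h1)
BINARY = {
    ("S", "NP", "VP"): [
        [[0.4, 0.1], [0.3, 0.2]],
        [[0.25, 0.25], [0.25, 0.25]],
    ],
}

# LEXICAL[(a, word)][h] = p(a -> word | a, h)
LEXICAL = {
    ("NP", "dogs"): [0.7, 0.2],
    ("NP", "cats"): [0.3, 0.8],
    ("VP", "bark"): [0.9, 0.5],
    ("VP", "sleep"): [0.1, 0.5],
}


def inside(tree):
    """Return a list b with b[h] = p(subtree | root label, latent state h)."""
    label = tree[0]
    # Preterminal node: (label, word)
    if len(tree) == 2 and isinstance(tree[1], str):
        return list(LEXICAL[(label, tree[1])])
    # Binary node: (label, left_subtree, right_subtree)
    _, left, right = tree
    rule = BINARY[(label, left[0], right[0])]
    b_left, b_right = inside(left), inside(right)
    # Marginalize over the latent states of both children.
    return [
        sum(
            rule[h1][h2][h3] * b_left[h2] * b_right[h3]
            for h2 in range(M)
            for h3 in range(M)
        )
        for h1 in range(M)
    ]


def tree_probability(tree):
    """Probability of the skeletal tree, summed over all latent assignments."""
    b = inside(tree)
    return sum(ROOT[tree[0]][h] * b[h] for h in range(M))


if __name__ == "__main__":
    example = ("S", ("NP", "dogs"), ("VP", "bark"))
    print(tree_probability(example))
```

The latent states never appear in the observed tree; they only affect its probability through the sums in `inside`. The recursion costs time linear in the number of tree nodes and cubic in `M`, whereas enumerating every assignment explicitly would be exponential in the number of nodes.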
