Simple Semi-Supervised Learning for Prepositional Phrase Attachment

Prepositional phrase attachment is an important subproblem of parsing, and performance on it suffers from the limited availability of labelled data. We present a semi-supervised approach. We show that a discriminative lexical model trained on labelled data and a generative lexical model learned via Expectation Maximization from unlabelled data can be combined in a product model to yield a PP-attachment model that is better than either model alone and that outperforms the modern parser of Petrov and Klein (2007) by a significant margin. We also show that, when learning from unlabelled data, it can be beneficial to model the generation of a head's modifiers collectively rather than individually. Finally, we suggest that our pair of models would be interesting to combine using new techniques for discriminatively constraining EM.
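
To make the idea of a product model concrete, the following is a minimal sketch of one common way to combine two distributions over the same attachment decision: multiply the probabilities and renormalise. It is an illustration under stated assumptions, not the paper's implementation. The function name `product_model`, the `weight` exponent on the generative model, and the example probabilities are all hypothetical.

```python
import math


def product_model(p_disc, p_gen, weight=1.0):
    """Combine two distributions over the same labels by a normalised product.

    p_disc and p_gen map each attachment label to a strictly positive
    probability. `weight` is a hypothetical exponent on the generative
    model (1.0 gives a plain product); it is not taken from the paper.
    """
    # Work in log space so that products of small probabilities stay stable.
    log_scores = {
        label: math.log(p_disc[label]) + weight * math.log(p_gen[label])
        for label in p_disc
    }
    # Subtract the maximum before exponentiating to avoid underflow.
    max_score = max(log_scores.values())
    unnormalised = {
        label: math.exp(score - max_score) for label, score in log_scores.items()
    }
    total = sum(unnormalised.values())
    return {label: value / total for label, value in unnormalised.items()}


# Made-up probabilities for a single ambiguous PP (illustrative only).
p_disc = {"noun": 0.6, "verb": 0.4}  # e.g. from a model trained on labelled data
p_gen = {"noun": 0.3, "verb": 0.7}   # e.g. from a model learned on unlabelled data

combined = product_model(p_disc, p_gen)
prediction = max(combined, key=combined.get)
```

In this toy case the two models disagree, and the product favours the label on which the more confident model places its mass. A normalised product of this kind only gives a label high probability when neither component assigns it low probability.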

[1]  D. Rubin, et al.  Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper, 1977.

[2]  Trevor Cohn, et al.  Logarithmic Opinion Pools for Conditional Random Fields, 2005, ACL.

[3]  Andrew Y. Ng, et al.  Learning random walk models for inducing word dependency distributions, 2004, ICML.

[4]  Michael Collins, et al.  Prepositional Phrase Attachment through a Backed-off Model, 1995, VLC@ACL.

[5]  Eric Brill, et al.  A Rule-Based Approach to Prepositional Phrase Attachment Disambiguation, 1994, COLING.

[6]  Beatrice Santorini, et al.  Building a Large Annotated Corpus of English: The Penn Treebank, 1993, CL.

[7]  Patrick Pantel, et al.  An Unsupervised Approach to Prepositional Phrase Attachment using Contextually Similar Words, 2000, ACL.

[8]  Christopher D. Manning, et al.  Incorporating Non-local Information into Information Extraction Systems by Gibbs Sampling, 2005, ACL.

[9]  Adwait Ratnaparkhi, et al.  A Maximum Entropy Model for Prepositional Phrase Attachment, 1994, HLT.

[10]  Makoto Nagao, et al.  Corpus Based PP Attachment Ambiguity Resolution with a Semantic Dictionary, 1997, VLC.

[11]  Martin Volk.  Combining Unsupervised and Supervised Methods for PP Attachment Disambiguation, 2002, COLING.

[12]  Dan Klein, et al.  Online EM for Unsupervised Models, 2009, NAACL.

[13]  Michael Collins, et al.  Head-Driven Statistical Models for Natural Language Parsing, 2003, CL.

[14]  Dan Klein, et al.  Improved Inference for Unlexicalized Parsing, 2007, NAACL.

[15]  Michael Collins, et al.  Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms, 2002, EMNLP.

[16]  R. Bordley.  A Multiplicative Formula for Aggregating Probability Assessments, 1982.

[17]  Adwait Ratnaparkhi, et al.  Statistical Models for Unsupervised Prepositional Phrase Attachment, 1998, ACL.

[18]  Max Welling, et al.  Product of experts, 2007, Scholarpedia.

[19]  Andrew McCallum, et al.  High-Performance Semi-Supervised Learning using Discriminatively Constrained Generative Models, 2010, ICML.

[20]  Daniel Gildea, et al.  Corpus Variation and Parser Performance, 2001, EMNLP.

[21]  Mats Rooth, et al.  Structural Ambiguity and Lexical Relations, 1991, ACL.

[22]  John A. Carroll, et al.  Applied morphological processing of English, 2001, Natural Language Engineering.

[23]  Hinrich Schütze, et al.  Prepositional Phrase Attachment without Oracles, 2007, Computational Linguistics.

[24]  Xavier Carreras, et al.  Experiments with a Higher-Order Projective Dependency Parser, 2007, EMNLP.

[25]  Ben Taskar, et al.  Posterior Regularization for Structured Latent Variable Models, 2010, J. Mach. Learn. Res.

[26]  Slav Petrov, et al.  Products of Random Latent Variable Grammars, 2010, NAACL.