Learning PP attachment from corpus statistics

One of the main problems in natural language analysis is the resolution of structural ambiguity, and prepositional phrase (PP) attachment ambiguity is a particularly difficult case. We describe a robust PP disambiguation procedure that learns from a text corpus. The method is based on a loglinear model, a type of statistical model that can account for combinations of multiple categorical features. We describe a series of experiments that compare the loglinear method against other strategies. For the difficult case of three possible attachment sites, the loglinear method predicts PP attachment with significantly higher accuracy than a simpler procedure that uses lexical association strengths. On general newswire text, however, the accuracy of the statistical method remains 10% below the performance of human experts. This suggests a limit on what can be learned automatically from text and points to the need to combine machine learning with human expertise.
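To make the idea of a loglinear model over categorical features concrete, the following is a minimal sketch, not the procedure evaluated in the paper. It assumes a conditional loglinear (multinomial logistic) formulation trained by plain gradient ascent, which may differ from the fitting method the paper uses. The feature names (`verb`, `noun1`, `noun2`, `prep`), the single `prep+verb` interaction term, the three site labels, and the toy training examples are all invented for illustration.

```python
import math
from collections import defaultdict

# Three candidate attachment sites (labels are an assumption of this sketch).
SITES = ("verb", "noun1", "noun2")

# Invented toy examples: categorical features paired with a gold attachment site.
TOY_DATA = [
    ({"verb": "saw", "noun1": "man", "noun2": "hill", "prep": "with"}, "verb"),
    ({"verb": "opened", "noun1": "door", "noun2": "hall", "prep": "with"}, "verb"),
    ({"verb": "read", "noun1": "book", "noun2": "shelf", "prep": "about"}, "noun1"),
    ({"verb": "found", "noun1": "letter", "noun2": "drawer", "prep": "about"}, "noun1"),
    ({"verb": "put", "noun1": "book", "noun2": "table", "prep": "in"}, "noun2"),
    ({"verb": "left", "noun1": "keys", "noun2": "desk", "prep": "in"}, "noun2"),
]


def features(example, site):
    """Indicator features that pair each categorical value with a candidate site."""
    feats = [("bias", site)]
    for name, value in example.items():
        feats.append((name, value, site))
    # One interaction term, showing how combinations of features enter the model.
    feats.append(("prep+verb", example["prep"], example["verb"], site))
    return feats


def probabilities(weights, example):
    """p(site | example) is proportional to exp(sum of active feature weights)."""
    scores = [sum(weights[f] for f in features(example, s)) for s in SITES]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(x - top) for x in scores]
    total = sum(exps)
    return {s: e / total for s, e in zip(SITES, exps)}


def train(data, epochs=200, rate=0.3, l2=0.01):
    """Batch gradient ascent on the L2-regularized conditional log-likelihood."""
    weights = defaultdict(float)
    for _ in range(epochs):
        grad = defaultdict(float)
        for example, gold in data:
            probs = probabilities(weights, example)
            for site in SITES:
                # Observed minus expected count for every feature active at this site.
                delta = (1.0 if site == gold else 0.0) - probs[site]
                for f in features(example, site):
                    grad[f] += delta
        for f, g in grad.items():
            weights[f] += rate * (g / len(data) - l2 * weights[f])
    return weights


def predict(weights, example):
    """Return the attachment site with the highest model probability."""
    probs = probabilities(weights, example)
    return max(SITES, key=probs.get)
```

A call such as `predict(train(TOY_DATA), {"verb": "fixed", "noun1": "pipe", "noun2": "basement", "prep": "with"})` scores an unseen example using only the weights learned for its known feature values. Words the model has never seen contribute nothing, so in this toy setting the preposition carries the decision.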
