Coordinate Noun Phrase Disambiguation in a Generative Parsing Model

In this paper we present methods for improving the disambiguation of noun phrase (NP) coordination within the framework of a lexicalised history-based parsing model. As well as reducing noise in the data, we look at modelling two main sources of information for disambiguation: symmetry in conjunct structure, and the dependency between conjunct lexical heads. Our changes to the baseline model result in an increase in NP coordination dependency f-score from 69.9% to 73.8%, which represents a relative reduction in f-score error of 13%.
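
The 13% figure follows if the error is taken to be 100 minus the f-score, so the error falls from 30.1 to 26.2 points. A minimal sketch of that arithmetic, assuming this definition of error (the function name is illustrative and not taken from the paper):

```python
def relative_error_reduction(baseline_f, improved_f):
    """Relative reduction in f-score error; both f-scores are percentages."""
    baseline_error = 100.0 - baseline_f   # error of the baseline model
    improved_error = 100.0 - improved_f   # error of the improved model
    return (baseline_error - improved_error) / baseline_error


# F-scores reported in the abstract for NP coordination dependencies.
reduction = relative_error_reduction(69.9, 73.8)
print(f"{reduction:.1%}")
```

With the reported f-scores, the result rounds to the 13% stated above.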
