Paper Information - Designing Agreement Features for Realization Ranking

Designing Agreement Features for Realization Ranking

This paper shows that incorporating linguistically motivated features to ensure correct animacy and number agreement in an averaged perceptron ranking model for CCG realization further improves a state-of-the-art baseline. Traditionally, these features have been modelled using hard constraints in the grammar. However, given the graded nature of grammaticality judgements in the case of animacy, we argue for the use of a statistical model to rank competing preferences. Though subject-verb agreement is generally viewed as syntactic in nature, a perusal of relevant examples discussed in the theoretical linguistics literature (Kathol, 1999; Pollard and Sag, 1994) points toward the heterogeneous nature of English agreement. Compared to writing grammar rules, our method is more robust and allows information from diverse sources to be incorporated in realization. We also show that the perceptron model can reduce balanced punctuation errors that would otherwise require a post-filter. The full model yields significant improvements in BLEU scores on Section 23 of the CCGbank and makes many fewer agreement errors.
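As a rough illustration of the kind of model the abstract describes, the following is a minimal sketch of an averaged perceptron that ranks candidate realizations. It is not the authors' implementation: the feature names (`num:...`, `anim:...`, `lm`), the toy candidates, and the function names are all hypothetical, chosen only to show how agreement preferences can be learned as soft weights rather than encoded as hard grammar constraints.

```python
from collections import defaultdict


def score(weights, feats):
    """Dot product between a weight map and a sparse feature dict."""
    return sum(weights.get(f, 0.0) * v for f, v in feats.items())


def rank(weights, candidates):
    """Return the index of the highest-scoring candidate realization."""
    return max(range(len(candidates)), key=lambda i: score(weights, candidates[i]))


def train_averaged_perceptron(examples, epochs=5):
    """Train on (candidates, gold_index) pairs and return averaged weights.

    Each candidate is a sparse feature dict. Averaging is done naively by
    summing the weight vector after every training instance.
    """
    weights = defaultdict(float)
    totals = defaultdict(float)
    steps = 0
    for _ in range(epochs):
        for candidates, gold in examples:
            pred = rank(weights, candidates)
            if pred != gold:
                # Standard perceptron update: promote gold, demote prediction.
                for f, v in candidates[gold].items():
                    weights[f] += v
                for f, v in candidates[pred].items():
                    weights[f] -= v
            for f, w in weights.items():
                totals[f] += w
            steps += 1
    return {f: t / steps for f, t in totals.items()}


# Hypothetical toy data: each instance lists [dispreferred, preferred] candidates.
examples = [
    ([{"num:sg+verb:pl": 1.0, "lm": 0.5},
      {"num:sg+verb:sg": 1.0, "lm": 0.4}], 1),
    ([{"anim:human+rel:which": 1.0, "lm": 0.2},
      {"anim:human+rel:who": 1.0, "lm": 0.7}], 1),
]

w = train_averaged_perceptron(examples)
best = rank(w, examples[0][0])
```

In this toy setup the learned weights reward the agreeing feature combinations and penalize the mismatched ones, so a mismatch lowers a candidate's score without ruling it out entirely, which is the graded behaviour the abstract argues for in the case of animacy.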

Michael White | Rajakrishnan Rajkumar

[1] Michael White, et al. Hypertagging: Supertagging for Surface Realization with CCG, 2008, ACL.

[2] Michael White, et al. A More Precise Analysis of Punctuation for Broad-Coverage Surface Realization with CCG, 2008, COLING 2008.

[3] Jason Baldridge, et al. Lexically specified derivational control in combinatory categorial grammar, 2002.

[4] Jun'ichi Tsujii, et al. Probabilistic Models for Disambiguation of an HPSG-Based Chart Generator, 2005, IWPT.

[5] Josef van Genabith, et al. Robust PCFG-Based Generation Using Automatically Acquired LFG Approximations, 2006, ACL.

[6] Stephan Oepen, et al. Maximum Entropy Models for Realization Ranking, 2005.

[7] Brian Roark, et al. Discriminative Language Modeling with Conditional Random Fields and the Perceptron Algorithm, 2004, ACL.

[8] Josef van Genabith, et al. Dependency-Based N-Gram Models for General Purpose Sentence Realisation, 2008, COLING.

[9] James R. Curran, et al. Wide-Coverage Efficient Statistical Parsing with CCG and Log-Linear Models, 2007, Computational Linguistics.

[10] Michael White, et al. Efficient Realization of Coordinate Structures in Combinatory Categorial Grammar, 2006.

[11] Martin Kay, et al. Syntactic Process, 1979, ACL.