Object-Extraction and Question-Parsing using CCG

Accurate dependency recovery has recently been reported for a number of wide-coverage statistical parsers using Combinatory Categorial Grammar (CCG). However, overall figures give no indication of a parser’s performance on specific constructions, nor how suitable a parser is for specific applications. In this paper we give a detailed evaluation of a CCG parser on object extraction dependencies found in WSJ text. We also show how the parser can be used to parse questions for Question Answering. The accuracy of the original parser on questions is very poor, and we propose a novel technique for porting the parser to a new domain, by creating new labelled data at the lexical category level only. Using a supertagger to assign categories to words, trained on the new data, leads to a dramatic increase in question parsing accuracy.

[1]  Ted Briscoe,et al.  Robust Accurate Statistical Annotation of General Text , 2002, LREC.

[2]  Mark Steedman,et al.  Surface structure and interpretation , 1996, Linguistic inquiry.

[3]  Beth Ann Hockey,et al.  Maintaining the Forest and Burning out the Underbrush in XTAG , 1997 .

[4]  Mark Steedman,et al.  Generative Models for Statistical Parsing with Combinatory Categorial Grammar , 2002, ACL.

[5]  Mark Johnson,et al.  A Simple Pattern-matching Algorithm for Recovering Empty Nodes and their Antecedents , 2002, ACL.

[6]  James R. Curran,et al.  Parsing the WSJ Using CCG and Log-Linear Models , 2004, ACL.

[7]  Mark Steedman,et al.  Wide-Coverage Semantic Representations from a CCG Parser , 2004, COLING.

[8]  Julia Hockenmaier Parsing with Generative Models of Predicate-Argument Structure , 2003, ACL.

[9]  Sanda M. Harabagiu,et al.  High performance question/answering , 2001, SIGIR '01.

[10]  Martin Kay,et al.  Syntactic Process , 1979, ACL.

[11]  James R. Curran,et al.  The Importance of Supertagging for Wide-Coverage CCG Parsing , 2004, COLING.

[12]  Amit Dubey,et al.  Deep Syntactic Processing by Combining Shallow Methods , 2003, ACL.

[13]  Srinivas Bangalore,et al.  Supertagging: An Approach to Almost Parsing , 1999, CL.

[14]  Eugene Charniak,et al.  A Maximum-Entropy-Inspired Parser , 2000, ANLP.

[15]  Julia Hockenmaier,et al.  Data and models for statistical parsing with combinatory categorial grammar , 2003 .

[16]  Michael Collins,et al.  Head-Driven Statistical Models for Natural Language Parsing , 2003, CL.

[17]  Mark Steedman,et al.  Building Deep Dependency Structures using a Wide-Coverage CCG Parser , 2002, ACL.