The Importance of Supertagging for Wide-Coverage CCG Parsing

This paper describes the role of supertagging in a wide-coverage CCG parser that uses a log-linear model to select an analysis. The supertagger reduces the derivation space over which model estimation is performed, which in turn reduces the space required for discriminative training. It also dramatically increases the speed of the parser. We show that large increases in speed can be obtained by tightly integrating the supertagger with the CCG grammar and parser. To our knowledge, this is the first work to successfully integrate a supertagger with a full parser that uses an automatically extracted grammar. We further reduce the derivation space using constraints on category combination. The result is an accurate wide-coverage CCG parser that is an order of magnitude faster than comparable systems for other linguistically motivated formalisms.
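The abstract does not spell out how the supertagger and parser are integrated. As a rough illustration only, the sketch below assumes one plausible scheme: each word keeps the lexical categories whose probability is within a factor `beta` of its best category, and the parser is retried with a looser `beta` when no analysis is found. The names `select_supertags`, `parse_with_backoff`, `try_parse`, and the `beta` values are hypothetical and are not taken from the paper's implementation.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Distribution = Dict[str, float]  # lexical category -> probability for one word


def select_supertags(dist: Distribution, beta: float) -> List[str]:
    """Keep categories whose probability is at least beta times the best one."""
    best = max(dist.values())
    ranked = sorted(dist.items(), key=lambda kv: -kv[1])
    return [cat for cat, p in ranked if p >= beta * best]


def parse_with_backoff(
    dists: Sequence[Distribution],
    try_parse: Callable[[List[List[str]]], Optional[str]],
    betas: Sequence[float] = (0.1, 0.01),
) -> Optional[Tuple[str, float]]:
    """Try the tightest cutoff first; loosen it only if parsing fails."""
    for beta in betas:
        tags = [select_supertags(d, beta) for d in dists]
        result = try_parse(tags)  # hypothetical parser hook (assumption)
        if result is not None:
            return result, beta
    return None


# Toy example: a stand-in "parser" that succeeds only when some word
# has been offered the category S\NP.
def toy_parser(tags: List[List[str]]) -> Optional[str]:
    return "ok" if any("S\\NP" in t for t in tags) else None


word_dists = [
    {"NP": 0.9, "N": 0.1},
    {"NP": 0.6, "N": 0.3, "S\\NP": 0.05},
]
outcome = parse_with_backoff(word_dists, toy_parser)
```

Under these assumptions, a tight cutoff gives the parser few categories per word, which keeps the derivation space small; the wider cutoff is used only for sentences that need it.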
