Efficient CCG Parsing: A* versus Adaptive Supertagging

We present a systematic comparison and combination of two orthogonal techniques for efficient parsing of Combinatory Categorial Grammar (CCG). First, we consider adaptive supertagging, a widely used approximate search technique that prunes most lexical categories from the parser's search space using a separate sequence model. Next, we consider several variants of A*, a classic exact search technique that, to our knowledge, has not been applied to more expressive grammar formalisms like CCG. In addition to standard hardware-independent measures of parser effort, we present what we believe is the first evaluation of A* parsing on the more realistic but more stringent metric of CPU time. By itself, A* substantially reduces parser effort as measured by the number of edges considered during parsing, but we show that for CCG this reduction does not always translate into improvements in CPU time over a CKY baseline. Combining A* with adaptive supertagging decreases CPU time by 15% for our best model.
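To make the pruning step concrete, the following is a minimal sketch of adaptive supertagging under stated assumptions, not the implementation evaluated here. It assumes the sequence model supplies a probability distribution over lexical categories for each word, keeps only the categories whose probability is within a factor beta of the best one, and relaxes beta when the parser finds no analysis. The names `prune`, `adaptive_supertag_parse`, and `parse`, as well as the beta schedule, are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# Assumed input format: one dict per word, mapping category -> tagger probability.
TagDist = Dict[str, float]


def prune(dist: TagDist, beta: float) -> List[str]:
    """Keep categories whose probability is at least beta times the best one."""
    best = max(dist.values())
    return [cat for cat, p in dist.items() if p >= beta * best]


def adaptive_supertag_parse(
    sentence_dists: Sequence[TagDist],
    parse: Callable[[List[List[str]]], Optional[object]],
    betas: Sequence[float] = (0.1, 0.01, 0.001),  # illustrative schedule only
) -> Tuple[Optional[object], Optional[float]]:
    """Try the tightest beam first; widen it only if parsing fails.

    `parse` is a hypothetical parser callback that returns None when no
    spanning analysis exists for the given per-word category lists.
    """
    for beta in betas:
        pruned = [prune(dist, beta) for dist in sentence_dists]
        result = parse(pruned)
        if result is not None:
            return result, beta
    return None, None
```

Because most sentences parse at the tightest setting, the parser usually sees only a few categories per word; the search is approximate, since a category pruned by the tagger can never be recovered by the parser at that setting.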
