Fully Lexicalising CCGbank with Hat Categories

We introduce an extension to CCG that allows form and function to be represented simultaneously, reducing the proliferation of modifier categories seen in standard CCG analyses. We can then remove the non-combinatory rules CCGbank uses to address this problem, producing a grammar that is fully lexicalised and far less ambiguous. There are intrinsic benefits to full lexicalisation, such as semantic transparency and simpler domain adaptation. The clearest advantage is a 52–88% improvement in parse speeds, which comes with only a small reduction in accuracy.
