Improving Dependency Parsers using Combinatory Categorial Grammar

Subcategorization information is a useful feature in dependency parsing. In this paper, we explore a method of incorporating this information via Combinatory Categorial Grammar (CCG) categories from a supertagger. We experiment with two popular dependency parsers (Malt and MST) for two languages: English and Hindi. For both languages, CCG categories improve the overall accuracy of both parsers by around 0.3-0.5% in all experiments. For each parser, we see larger improvements on the dependencies where it is known to be weak: long-distance dependencies for Malt, and verbal arguments for MST. The result is particularly interesting for the fast greedy parser (Malt), since improving its accuracy without significantly compromising speed is relevant for large-scale applications such as parsing the web.
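
As a rough illustration of how supertagger output could be handed to a data-driven dependency parser, the sketch below writes one CCG category per token into the FEATS column of CoNLL-X formatted input. The function name `add_supertags`, the `ccg=` feature prefix, and the choice of the FEATS column are assumptions made for this example; the abstract does not say how the categories are encoded as features.

```python
def add_supertags(conll_lines, supertags):
    """Attach one CCG category per token to the CoNLL-X FEATS column.

    Assumes tab-separated CoNLL-X rows (FEATS is the sixth column) and
    that the categories themselves contain no '|' characters.
    """
    tags = iter(supertags)
    out = []
    for line in conll_lines:
        if not line.strip():
            # Blank lines separate sentences; pass them through unchanged.
            out.append(line)
            continue
        cols = line.split("\t")
        feat = "ccg=" + next(tags)
        # Append to existing features, or replace the empty marker "_".
        cols[5] = feat if cols[5] == "_" else cols[5] + "|" + feat
        out.append("\t".join(cols))
    return out


# Hypothetical two-token sentence with hand-assigned categories.
sentence = [
    "1\tJohn\tjohn\tNNP\tNNP\t_\t2\tnsubj\t_\t_",
    "2\tsleeps\tsleep\tVBZ\tVBZ\t_\t0\troot\t_\t_",
]
tagged = add_supertags(sentence, ["NP", "S\\NP"])
```

With the category in a standard column, a parser's feature templates can refer to it like any other token attribute; how Malt or MST would actually be configured to use it is not covered by this sketch.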
