Improving Telugu Dependency Parsing using Combinatory Categorial Grammar Supertags

We show that Combinatory Categorial Grammar (CCG) supertags can improve Telugu dependency parsing. We first extract a CCG lexicon from the dependency treebank. Using both the CCG lexicon and the dependency treebank, we then create a CCG treebank with a chart parser. Exploring different morphological features of Telugu, we develop a supertagger based on maximum entropy models. We provide the CCG supertags as features to the Telugu dependency parser (MST parser). This yields an improvement of 1.8% in the unlabelled attachment score and 2.2% in the labelled attachment score. Our results show that CCG supertags improve the MST parser, especially on verbal arguments, for which it has weak recovery rates.
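For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how unlabelled and labelled attachment scores are commonly computed. It is not the evaluation script used in this work: the function name `attachment_scores`, the token representation, and the example labels are assumptions made for illustration, and conventions such as whether punctuation tokens are counted vary between evaluations.

```python
from typing import List, Tuple

# Hypothetical representation: each token is (head_index, dependency_label),
# with head_index 0 standing for the artificial root.
Token = Tuple[int, str]


def attachment_scores(
    gold: List[List[Token]], pred: List[List[Token]]
) -> Tuple[float, float]:
    """Return (UAS, LAS) as percentages over all tokens in the corpus."""
    if len(gold) != len(pred):
        raise ValueError("gold and predicted corpora differ in sentence count")

    total = correct_head = correct_head_and_label = 0
    for gold_sent, pred_sent in zip(gold, pred):
        if len(gold_sent) != len(pred_sent):
            raise ValueError("gold and predicted sentences differ in length")
        for (g_head, g_label), (p_head, p_label) in zip(gold_sent, pred_sent):
            total += 1
            if g_head == p_head:
                # Unlabelled attachment: only the head must match.
                correct_head += 1
                if g_label == p_label:
                    # Labelled attachment: head and label must both match.
                    correct_head_and_label += 1

    if total == 0:
        raise ValueError("cannot score an empty corpus")
    return 100.0 * correct_head / total, 100.0 * correct_head_and_label / total


# Toy example with made-up labels: all heads are right, one label is wrong.
gold = [[(2, "subj"), (0, "root"), (2, "obj")]]
pred = [[(2, "subj"), (0, "root"), (2, "subj")]]
uas, las = attachment_scores(gold, pred)
print(f"UAS = {uas:.2f}, LAS = {las:.2f}")
```

Because a token counts towards the labelled score only when its head is also correct, the labelled score can never exceed the unlabelled one.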
