Transition-based combinatory categorial grammar parsing for English and Hindi

Given a natural language sentence, parsing is the task of assigning it a grammatical structure according to the rules of a particular grammar formalism. Several grammar formalisms, such as Dependency Grammar, Phrase Structure Grammar, Combinatory Categorial Grammar, and Tree Adjoining Grammar, have been explored in the parsing literature. For example, given a sentence like “John ate an apple”, parsers based on the widely used dependency grammars find grammatical relations, such as that ‘John’ is the subject and ‘apple’ is the object of the action ‘ate’. This thesis focuses on Combinatory Categorial Grammar (CCG) and presents an incremental algorithm for parsing CCG for two diverse languages: English and Hindi. English is a fixed word order, SVO (Subject-Verb-Object), morphologically simple language, whereas Hindi, though predominantly SOV (Subject-Object-Verb), is a free word order, morphologically rich language. Developing an incremental parser for Hindi is particularly challenging because the predicate needed to resolve dependencies comes at the end. As previously available shift-reduce CCG parsers use English CCGbank derivations, which are mostly right-branching and non-incremental, we design our algorithm based on the dependencies resolved rather than the derivation. Our novel algorithm builds a dependency graph in parallel with the CCG derivation, and this graph is used to reveal the unbuilt structure without backtracking. Though we use dependencies for meaning representation and CCG for parsing, our revealing technique can be applied to other meaning representations, such as lambda expressions, and to non-CCG parsing, such as phrase structure parsing.

Any statistical parser requires three major modules: data, a parsing algorithm, and a learning algorithm. This thesis is broadly divided into three parts, each dealing with one of these modules. In Part I, we design a novel algorithm for converting a dependency treebank to a CCGbank. We create a Hindi CCGbank with 96% coverage using this algorithm. We also run a cross-formalism experiment showing that CCG supertags can improve widely used dependency parsers. We experiment with two popular dependency parsers (Malt and MST) for two diverse languages: English and Hindi. For both languages, CCG categories improve the overall accuracy of both parsers by around 0.3-0.5% in all experiments. For both parsers, we see larger improvements specifically on dependencies at which they are known to be weak: long-distance dependencies for Malt, and verbal arguments for MST. The result is particularly interesting in the case of the fast greedy parser (Malt), since
