Wide-Coverage Parsing, Semantics, and Morphology

Wide-coverage parsing poses three demands: broad coverage, preferably over free text; depth in semantic representation, for purposes such as inference in question answering; and computational efficiency. We show for Turkish that these goals are not inherently contradictory when categories are assigned to sub-lexical elements in the lexicon. The presumed computational burden of processing such lexicons does not arise with automata-constrained formalisms that can be trained on word-meaning correspondences at the level of predicate-argument structures for any string, a property characteristic of radically lexicalizable grammars. The approach also helps in morphologically simpler languages, where word-based parsing has been shown to benefit from sub-lexical training.
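To make the idea of assigning categories to sub-lexical elements concrete, the following is a minimal sketch, not the lexicon or parser described in this paper. It assumes a toy categorial lexicon in which a Turkish stem and two suffixes each carry their own category and meaning, and it combines them with plain function application. The categories (N, NP[acc]), the semantic notation plu(...), and all names in the code are hypothetical choices made for illustration only.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Atom:
    """An atomic category such as N."""
    name: str


@dataclass(frozen=True)
class Functor:
    """A functor category: result/arg looks right, result\\arg looks left."""
    result: "Category"
    slash: str  # "/" or "\\"
    arg: "Category"


Category = Union[Atom, Functor]


@dataclass(frozen=True)
class Entry:
    """A lexical item or derived constituent: form, category, meaning."""
    form: str
    cat: Category
    sem: object  # a string constant, or a callable for functor meanings


def combine(left: Entry, right: Entry) -> Optional[Entry]:
    """Apply forward (X/Y Y => X) or backward (Y X\\Y => X) application."""
    if isinstance(left.cat, Functor) and left.cat.slash == "/" and left.cat.arg == right.cat:
        return Entry(left.form + right.form, left.cat.result, left.sem(right.sem))
    if isinstance(right.cat, Functor) and right.cat.slash == "\\" and right.cat.arg == left.cat:
        return Entry(left.form + right.form, right.cat.result, right.sem(left.sem))
    return None


def derive(entries):
    """Combine a sequence of entries left to right; None if any step fails."""
    result = entries[0]
    for nxt in entries[1:]:
        result = combine(result, nxt)
        if result is None:
            return None
    return result


# Hypothetical toy lexicon: each morpheme, not each word, carries a category.
N = Atom("N")
NP_ACC = Atom("NP[acc]")
KITAP = Entry("kitap", N, "book")                              # stem
PLU = Entry("lar", Functor(N, "\\", N), lambda x: f"plu({x})")  # plural suffix
ACC = Entry("ı", Functor(NP_ACC, "\\", N), lambda x: x)         # accusative suffix

if __name__ == "__main__":
    word = derive([KITAP, PLU, ACC])
    print(word.form, word.sem)
```

In this sketch the word form kitapları is never listed in the lexicon; its category and its predicate-argument meaning are built from the entries for the stem and the suffixes. A real system would need allomorphy, ambiguity handling, and a trained model, none of which is modelled here.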
