Defining Syntax for Learner Language Annotation

We discuss how to make syntactic annotation for learner language more precise by clarifying the properties that each layer of annotation refers to. Building on previous proposals that split linguistic annotation into multiple layers to capture non-canonical properties of learner language, we lay out the questions that must be asked for grammatical annotation and provide some answers. Our investigation suggests that the layer of distributional syntax is based on properties of the target language (L2) and is largely redundant with the other layers. We show, for example, that subcategorization seems better able to underspecify annotation in situations where no single correct solution can be found. This paves the way for applying the annotation to larger corpus efforts, and it also represents a significant step in elucidating syntax for non-canonical language.
