Detecting and Diagnosing Grammatical Errors for Beginning Learners of German: From Learner Corpus Annotation to Constraint Satisfaction Problems.

This thesis presents a corpus of beginning learner German with a reliable error annotation scheme and an approach for detecting and diagnosing grammatical errors in learner language. A constraint-based dependency parser provides the foundation for a flexible and modular analysis of German by representing parsing as a constraint satisfaction problem. The grammar checker, Fledgling, detects and diagnoses errors using constraint relaxation with a general-purpose conflict detection algorithm. Fledgling is developed and evaluated using authentic learner productions from the learner corpus. It judges grammaticality correctly for 80% of sentences and is 82–91% accurate in determining whether a sentence contains selection, agreement, or word order errors.
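The idea of diagnosing an error by relaxing constraints can be illustrated with a toy example. The following is a minimal sketch only, not Fledgling's implementation: the sentence, the feature domains, the constraint names, and the helper functions `satisfiable` and `minimal_conflict` are all hypothetical, and the simple deletion-based search stands in for whatever general-purpose conflict detection algorithm the thesis actually uses. It treats number agreement in the ungrammatical "die Kinder schläft" as a small constraint satisfaction problem and reports the constraint that must be relaxed.

```python
from itertools import product

def satisfiable(domains, constraints):
    """Brute-force check: does any assignment satisfy every constraint?"""
    names = list(domains)
    for values in product(*(domains[n] for n in names)):
        assignment = dict(zip(names, values))
        if all(check(assignment) for check in constraints.values()):
            return True
    return False

def minimal_conflict(domains, constraints):
    """Deletion-based conflict detection (an assumption, chosen for brevity).

    Returns a subset of constraints that is unsatisfiable but becomes
    satisfiable if any one of its members is dropped. Returns an empty
    dict when the full problem is already satisfiable.
    """
    if satisfiable(domains, constraints):
        return {}
    conflict = dict(constraints)
    for name in list(constraints):
        trial = {k: v for k, v in conflict.items() if k != name}
        # If the problem is still unsatisfiable without this constraint,
        # the constraint is not needed to explain the conflict.
        if not satisfiable(domains, trial):
            conflict = trial
    return conflict

# Hypothetical number readings for each token of "die Kinder schläft".
domains = {
    "det": ["sg", "pl"],   # "die" is ambiguous between singular and plural
    "noun": ["pl"],        # "Kinder" is plural
    "verb": ["sg"],        # "schläft" is singular
}

# Hypothetical agreement constraints over the dependency structure.
constraints = {
    "det_noun_number": lambda a: a["det"] == a["noun"],
    "subj_verb_number": lambda a: a["noun"] == a["verb"],
}

conflict = minimal_conflict(domains, constraints)
print(sorted(conflict))  # ['subj_verb_number']
```

In this sketch the sentence is judged ungrammatical because the full constraint set has no solution, and the diagnosis is the constraint left in the minimal conflict: relaxing subject-verb number agreement is enough to make the problem solvable. Replacing the verb domain with `["pl"]` (as for "schlafen") yields an empty conflict.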
