Syntactic Dependency Based N-grams in Rule Based Automatic English as Second Language Grammar Correction

In this paper, we present a system for automatic grammatical error correction in English as a second language (L2). The system participated in the CoNLL-2013 shared task. It applies a set of simple rules to correct grammatical errors. In some cases, it uses syntactic n-grams, i.e., n-grams that are constructed in a syntactic manner, namely by following paths in dependency trees; a special procedure is used to obtain them. More generally, syntactic n-grams make it possible to introduce syntactic information into machine learning methods, because they retain all the properties of traditional n-grams. The system is simple, uses practically no additional linguistic resources, and was built in two months. Owing to its simplicity, it does not outperform more sophisticated systems that rely on many resources, the Internet, and machine learning methods, but it can serve as a baseline system for the task.
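To illustrate what "following paths in dependency trees" can mean, here is a minimal sketch in Python. It is not the procedure used by the system described here; it only assumes that a dependency tree is given as a mapping from each head word to its dependents, and it collects every downward path of n words (head, dependent, dependent of the dependent, and so on). The function name `syntactic_ngrams`, the tree encoding, and the example sentence are all hypothetical choices made for this illustration.

```python
from typing import Dict, List, Tuple


def syntactic_ngrams(children: Dict[str, List[str]], n: int) -> List[Tuple[str, ...]]:
    """Collect all head-to-dependent paths of exactly n words.

    `children` maps a head word to its list of dependents. This encoding
    is an assumption for the sketch and requires words to be unique.
    """

    def paths_from(node: str, length: int) -> List[Tuple[str, ...]]:
        # A path of one word is the node itself.
        if length == 1:
            return [(node,)]
        result = []
        # Extend the path through every dependent of the current node.
        for child in children.get(node, []):
            for tail in paths_from(child, length - 1):
                result.append((node,) + tail)
        return result

    # Every word in the tree may start a path, not only the root.
    nodes = set(children) | {c for kids in children.values() for c in kids}
    ngrams: List[Tuple[str, ...]] = []
    for node in sorted(nodes):
        ngrams.extend(paths_from(node, n))
    return ngrams


# Hypothetical dependency tree for "I saw a black dog":
# "saw" governs "I" and "dog"; "dog" governs "a" and "black".
tree = {"saw": ["I", "dog"], "dog": ["a", "black"]}

bigrams = syntactic_ngrams(tree, 2)
trigrams = syntactic_ngrams(tree, 3)
```

In this example the trigrams include ("saw", "dog", "black"), a sequence that never appears as three adjacent words in the surface sentence. That is the sense in which the neighbors in a syntactic n-gram are determined by the tree rather than by word order.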
