Performance Evaluation of a Novel Technique for Word Order Errors Correction Applied to Non Native English Speakers' Corpus

This work presents the evaluation results of a novel technique for correcting word order errors, using a corpus produced by non-native English speakers. The technique, which is language independent, repairs word order errors in sentences using the probabilities of the most typical trigrams and bigrams extracted from a large text corpus such as the British National Corpus (BNC). A good indicator of whether a person really knows a language is the ability to use the appropriate words in a sentence in the correct word order. Scrambling the words of a sentence produces a meaningless sentence. Most languages have a fairly fixed word order. Non-native speakers and writers of English as a Second Language make word order errors more frequently. These errors often arise when the student translates, that is, thinks in his or her native language and tries to render the thought in English. For this reason, the experiment uses a test set of 50 sentences translated from Greek to English. Its purpose is to determine how the system performs on real data produced by non-native English speakers.
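The idea of repairing word order with bigram and trigram statistics can be illustrated with a minimal sketch. It is not the authors' implementation: the toy corpus, the function names (`build_counts`, `score`, `repair`), the add-one log-count scoring, and the exhaustive search over permutations are all assumptions made for illustration. A real system would estimate probabilities from a large corpus such as the BNC and would need a more efficient search.

```python
from collections import Counter
from itertools import permutations
import math

# Toy stand-in for a large reference corpus such as the BNC (illustrative only).
CORPUS = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "a cat sat on a chair",
]

def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def build_counts(sentences):
    """Count bigrams and trigrams, padding with sentence boundary markers."""
    bigrams, trigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        bigrams.update(ngrams(tokens, 2))
        trigrams.update(ngrams(tokens, 3))
    return bigrams, trigrams

def score(tokens, bigrams, trigrams):
    """Hypothetical score: sum of log(1 + count) over all bigrams and trigrams.

    Unseen n-grams contribute 0, so candidates made of frequent n-grams win.
    """
    padded = ["<s>"] + list(tokens) + ["</s>"]
    total = sum(math.log(1 + bigrams[g]) for g in ngrams(padded, 2))
    total += sum(math.log(1 + trigrams[g]) for g in ngrams(padded, 3))
    return total

def repair(sentence, bigrams, trigrams):
    """Return the best-scoring permutation of the input words.

    Exhaustive search is factorial in sentence length, so this is only
    practical for short sentences.
    """
    tokens = sentence.split()
    best = max(permutations(tokens), key=lambda p: score(p, bigrams, trigrams))
    return " ".join(best)

BIGRAMS, TRIGRAMS = build_counts(CORPUS)
print(repair("sat the on mat cat the", BIGRAMS, TRIGRAMS))
```

With this toy corpus, the scrambled input is restored to "the cat sat on the mat", because that ordering is the only one whose bigrams and trigrams are all attested in the corpus.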
