Error-tagged learner corpora and CALL: a promising synergy

Learner corpora, electronic collections of foreign or second language learner data, constitute a new resource for second language acquisition (SLA) and foreign language teaching (FLT) specialists. They are especially useful when they are error-tagged, that is, when all errors in the corpus have been annotated using a standardized system of error tags. This article describes the three-tiered error annotation system designed to annotate the French Interlanguage Database (FRIDA) corpus. The research took place within the framework of the FreeText project, which aims to produce a learner corpus-informed computer-assisted language learning (CALL) program for French as a Foreign Language. Once annotated, the FRIDA corpus was processed with standard text retrieval software to extract detailed error statistics and to carry out concordance-based analyses of specific error types. The results were used to focus the CALL exercises on learners' attested difficulties and to improve the error diagnosis system integrated into the CALL program.
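
As a rough illustration of the retrieval step described above, the following Python sketch counts error tags and builds a small concordance for one error type. It is not the FRIDA toolchain or tag set: the inline markup `<err tag="..." corr="...">`, the three-part codes such as `G-GEN-ADJ`, the function names, and the sample sentence are all hypothetical, chosen only to show how a three-tiered tag could support both statistics and concordance-based analysis.

```python
import re
from collections import Counter

# Hypothetical inline markup (NOT the actual FRIDA format):
#   <err tag="TIER1-TIER2-TIER3" corr="correction">erroneous form</err>
ERR = re.compile(r'<err tag="([^"]+)" corr="([^"]*)">(.*?)</err>')


def strip_tags(text):
    """Remove the annotation markup, keeping the learner's original form."""
    return ERR.sub(lambda m: m.group(3), text)


def error_counts(text, tiers=3):
    """Count errors, grouping tags by their first `tiers` levels."""
    counts = Counter()
    for m in ERR.finditer(text):
        key = "-".join(m.group(1).split("-")[:tiers])
        counts[key] += 1
    return counts


def concordance(text, tag_prefix, width=30):
    """Return (left context, error, right context, correction) tuples
    for every error whose leading tiers equal `tag_prefix`."""
    wanted = tag_prefix.split("-")
    rows = []
    for m in ERR.finditer(text):
        if m.group(1).split("-")[:len(wanted)] == wanted:
            left = strip_tags(text[:m.start()])[-width:]
            right = strip_tags(text[m.end():])[:width]
            rows.append((left, m.group(3), right, m.group(2)))
    return rows


# Invented sample sentence with two hypothetical error tags.
sample = ('Les enfants <err tag="G-NBR-VER" corr="jouent">joue</err> dans le '
          '<err tag="G-GEN-ADJ" corr="grand">grande</err> jardin.')

if __name__ == "__main__":
    print(error_counts(sample))           # counts per full three-part tag
    print(error_counts(sample, tiers=1))  # counts per top-level tier only
    for row in concordance(sample, "G-GEN"):
        print(row)
```

Grouping on a variable number of tiers is one way a hierarchical tag set could yield both coarse and fine-grained statistics from the same annotation; the concordance rows pair each error with its context and proposed correction, which is the kind of evidence that could inform exercise design.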