Gap-fill Tests for Language Learners: Corpus-Driven Item Generation

Gap-fill exercises play an important role in language teaching: they allow students to demonstrate that they understand vocabulary in context, discouraging rote memorization of translations. Creating good test items is time-consuming and difficult for item writers, and even then the items are open to Sinclair’s critique of invented examples. We present TEDDCLOG, a system which automatically generates draft test items from a corpus. TEDDCLOG takes as input the key (the word which will form the correct answer to the exercise). It finds distractors (the alternative, wrong answers for the multiple-choice question) in a distributional thesaurus, and identifies a collocate of the key that does not occur with the distractors. It then finds a simple corpus sentence containing the key and the collocate, and presents the sentence and distractors to the user for approval, modification or rejection. The system is implemented using the API to the Sketch Engine, a leading corpus query system. We compare TEDDCLOG with other gap-fill generation systems, and offer a partial evaluation of the system.
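The pipeline the abstract describes can be sketched as follows. This is a toy illustration over in-memory data, not the published system: the real TEDDCLOG queries the Sketch Engine API over a large corpus, and the data structures and function name here are assumptions made for the sketch.

```python
def build_item(key, thesaurus, collocates, corpus, n_distractors=3):
    """Draft a gap-fill item for `key` (toy version of the TEDDCLOG steps)."""
    # 1. Candidate distractors: distributionally similar words to the key.
    distractors = [w for w in thesaurus.get(key, []) if w != key][:n_distractors]

    # 2. Pick a collocate of the key that none of the distractors takes,
    #    so the distractors are clearly wrong in context.
    collocate = None
    for c in collocates.get(key, []):
        if all(c not in collocates.get(d, []) for d in distractors):
            collocate = c
            break
    if collocate is None:
        return None

    # 3. Find a simple corpus sentence containing both key and collocate.
    for sent in corpus:
        words = sent.lower().split()
        if key in words and collocate in words:
            # 4. Blank out the key to produce the gap-fill stem.
            stem = " ".join("____" if w == key else w for w in words)
            return {"stem": stem, "key": key, "distractors": distractors}
    return None


item = build_item(
    "strong",
    thesaurus={"strong": ["powerful", "heavy", "big"]},
    collocates={"strong": ["tea", "wind"], "powerful": ["engine"],
                "heavy": ["rain"], "big": ["house"]},
    corpus=["She made a cup of strong tea", "The wind was blowing"],
)
# item["stem"] → "she made a cup of ____ tea"
```

In the example, "tea" is chosen as the discriminating collocate because none of the distractors ("powerful", "heavy", "big") co-occurs with it in the toy collocation data, so only the key fits the gap. The draft item would then go to a human editor for approval, modification or rejection.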

[1] Chao-Lin Liu et al. Using Lexical Constraints to Enhance the Quality of Computer-Generated Multiple-Choice Cloze Items, 2005, Int. J. Comput. Linguistics Chin. Lang. Process.

[2] John Sinclair et al. Looking up: an account of the COBUILD Project in lexical computing and the development of the Collins COBUILD English Language Dictionary, 1987.

[3] Wilson L. Taylor. “Cloze Procedure”: A New Tool for Measuring Readability, 1953.

[4] Maxine Eskénazi et al. Semi-automatic generation of cloze question distractors: effect of students’ L1, 2009, SLaTE.

[5] Adam Kilgarriff. Googleology is Bad Science, 2007, Computational Linguistics.

[6] Adam Kilgarriff et al. GDEX: Automatically Finding Good Dictionary Examples in a Corpus, 2008.

[7] David J. Weir et al. Co-occurrence Retrieval: A Flexible Framework for Lexical Distributional Similarity, 2005, CL.

[8] Jack Mostow et al. Using Automated Questions to Assess Reading Comprehension, Vocabulary, and Effects of Tutorial Interventions, 2004.

[9] Michael Heilman et al. A Selection Strategy to Improve Cloze Question Quality, 2008.

[10] Silvia Bernardini et al. Introducing and evaluating ukWaC, a very large web-derived corpus of English, 2008.

[11] Hiroshi Nakagawa et al. Assisting cloze test making with a web application, 2007.

[12] Eiichiro Sumita et al. Measuring Non-native Speakers’ Proficiency of English by Using a Test with Automatically-Generated Fill-in-the-Blank Questions, 2005.