Evaluation of automatically generated English vocabulary questions

This paper describes the evaluation experiments for questions created by an automatic question generation system. Given a target word and one of its word senses, the system generates a multiple-choice English vocabulary question that asks for the option closest in meaning to the target word as it is used in a reading passage. Two kinds of evaluation were conducted, covering two aspects of the generated questions: (1) their ability to measure English learners' proficiency and (2) their similarity to human-made questions. The first evaluation is based on responses from English learners to whom both the machine-generated and human-made questions were administered, and the second is based on subjective judgements by English teachers. Both evaluations showed that the machine-generated questions reached a level comparable to the human-made questions, both in measuring English proficiency and in similarity.
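
To illustrate how learner responses can support the first kind of evaluation, the sketch below computes two standard classical-test-theory item statistics, difficulty (proportion correct) and discrimination (point-biserial correlation between an item score and the rest-of-test score), which could be compared between machine-generated and human-made items. This is a minimal illustrative sketch, not the paper's exact analysis; the function name and the toy response data are hypothetical.

```python
# Hedged sketch of a classical item analysis over learner responses.
# difficulty  = proportion of learners answering the item correctly
# discrimination = point-biserial correlation of the item score with the
#                  total score on the remaining items (rest-of-test score)
from statistics import mean, pstdev

def item_statistics(responses):
    """responses: list of per-learner lists of 0/1 item scores (equal length)."""
    n_items = len(responses[0])
    stats = []
    for i in range(n_items):
        item_scores = [r[i] for r in responses]
        rest_totals = [sum(r) - r[i] for r in responses]  # exclude item i
        difficulty = mean(item_scores)
        sd_item, sd_rest = pstdev(item_scores), pstdev(rest_totals)
        if sd_item == 0 or sd_rest == 0:
            discrimination = 0.0
        else:
            cov = (mean(x * y for x, y in zip(item_scores, rest_totals))
                   - mean(item_scores) * mean(rest_totals))
            discrimination = cov / (sd_item * sd_rest)
        stats.append((difficulty, discrimination))
    return stats

# Toy example: 5 learners answering 3 items (1 = correct, 0 = incorrect).
toy_responses = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
    [1, 1, 1],
]
for idx, (p, r_pb) in enumerate(item_statistics(toy_responses), start=1):
    print(f"item {idx}: difficulty={p:.2f}, discrimination={r_pb:.2f}")
```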
