Semantic Similarity Measures for the Generation of Science Tests in Basque

The work presented in this paper aims to help teachers create multiple-choice science tests. We focus on a scientific vocabulary-learning scenario set in a Basque-language educational environment. In this scenario, we explore automatically generating Multiple-Choice Questions (MCQs) by means of Natural Language Processing (NLP) techniques and corpora. More specifically, human experts select scientific articles and identify the target terms (i.e., words). These terms belong to the vocabulary studied in the school curriculum for 13-14-year-olds, and they form the starting point from which our system generates MCQs. The system automatically generates distractors that are similar in meaning to the target term. To this end, it applies semantic similarity measures that draw on a variety of corpus-based and graph-based approaches. The paper presents a qualitative and a quantitative analysis of the generated tests to measure the quality of the proposed methods. The qualitative analysis is based on expert opinion, whereas the quantitative analysis is based on the MCQ test responses of secondary school students. A total of 951 students from 18 schools took part in the experiments. The results show that our system could help experts generate MCQs.
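The idea of choosing distractors by semantic similarity can be illustrated with a minimal sketch. It is not the system described in the paper: it assumes that each candidate term already has a vector representation (for example, one derived from a corpus) and simply ranks candidates by cosine similarity to the target term. The function names, the toy vectors, and the example Basque terms (zelula "cell", mintz "membrane", ehun "tissue", organo "organ", planeta "planet") are all hypothetical and chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_distractors(target, vectors, k=3):
    """Return the k terms most similar to `target`, excluding the target itself.

    `vectors` maps each term to its vector. Ties are broken alphabetically
    so that the ranking is deterministic.
    """
    target_vec = vectors[target]
    scored = [
        (cosine(target_vec, vec), term)
        for term, vec in vectors.items()
        if term != target
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [term for _, term in scored[:k]]


# Hypothetical toy vectors, invented for this example (not taken from any corpus).
TOY_VECTORS = {
    "zelula": [0.90, 0.10, 0.00],
    "mintz": [0.85, 0.15, 0.05],
    "ehun": [0.80, 0.20, 0.10],
    "organo": [0.70, 0.30, 0.10],
    "planeta": [0.00, 0.10, 0.90],
}

if __name__ == "__main__":
    # Candidate distractors for the target term, most similar first.
    print(rank_distractors("zelula", TOY_VECTORS, k=3))
```

In this toy setting the unrelated term "planeta" is ranked last and is not selected. A practical pipeline would presumably also need to discard candidates that are valid answers (such as synonyms of the target term), which this sketch does not attempt.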
