Distractor Generation with Generative Adversarial Nets for Automatically Creating Fill-in-the-blank Questions

Distractor generation is a crucial step in fill-in-the-blank question generation. We propose a generative model, trained as a generative adversarial network (GAN), that creates useful distractors. Our method uses only the context surrounding the blank and never the correct answer, which sets it apart from previous ontology-based and similarity-based approaches. Trained on a Wikipedia corpus, the model predicts Wikipedia entities as distractors. We evaluate the method on two biology question datasets, one collected from Wikipedia and one from actual college-level exams. Experimental results show that our context-based method performs comparably to a widely used word2vec-based method on the Wikipedia dataset. We further propose a second-stage learner that combines the strengths of the two methods, improving performance on both datasets: 51.7% and 48.4% of the generated distractors are judged acceptable.
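
As a minimal sketch of the approach the abstract describes (assuming a PyTorch realization; the layer sizes, the LSTM context encoder, and the Gumbel-softmax relaxation of the discrete entity choice are illustrative assumptions, not the authors' released implementation): a generator scores every vocabulary entity from the context around the blank, never seeing the correct answer, while a discriminator judges whether an entity plausibly fills that context.

```python
# Minimal sketch of a context-conditioned distractor GAN (hypothetical
# realization; sizes and architecture are illustrative, not the paper's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, EMB_DIM, CTX_DIM = 10000, 128, 256      # hypothetical sizes

class Generator(nn.Module):
    """Encodes the words around the blank and emits a distribution over
    candidate Wiki entities; the correct answer is never an input."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB_DIM)
        self.enc = nn.LSTM(EMB_DIM, CTX_DIM, batch_first=True)
        self.out = nn.Linear(CTX_DIM, VOCAB)

    def forward(self, context_ids, tau=1.0):
        h, _ = self.enc(self.emb(context_ids))
        logits = self.out(h[:, -1])             # score every entity
        # Gumbel-softmax keeps the discrete entity choice differentiable,
        # so discriminator gradients can reach the generator.
        return F.gumbel_softmax(logits, tau=tau, hard=True)

class Discriminator(nn.Module):
    """Judges whether an entity plausibly fills the given context."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB_DIM)
        self.enc = nn.LSTM(EMB_DIM, CTX_DIM, batch_first=True)
        self.ent = nn.Linear(VOCAB, CTX_DIM)    # consumes a (soft) one-hot entity
        self.score = nn.Linear(2 * CTX_DIM, 1)

    def forward(self, context_ids, entity_onehot):
        h, _ = self.enc(self.emb(context_ids))
        joint = torch.cat([h[:, -1], self.ent(entity_onehot)], dim=-1)
        return self.score(joint)                # real-vs-generated logit

G, D = Generator(), Discriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-4)
bce = nn.BCEWithLogitsLoss()

ctx = torch.randint(0, VOCAB, (32, 20))         # dummy context batch
real = F.one_hot(torch.randint(0, VOCAB, (32,)), VOCAB).float()

# Discriminator step: entities observed in real text vs. generated ones.
loss_d = bce(D(ctx, real), torch.ones(32, 1)) + \
         bce(D(ctx, G(ctx).detach()), torch.zeros(32, 1))
opt_d.zero_grad(); loss_d.backward(); opt_d.step()

# Generator step: produce entities the discriminator accepts as real.
loss_g = bce(D(ctx, G(ctx)), torch.ones(32, 1))
opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```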

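The second-stage learner could plausibly be realized as a ranker over features from both methods. The two-feature logistic regression below is a hypothetical combiner (the abstract only states that the learner combines the strengths of the two methods); the scores, labels, and the helper `rank_distractors` are toy illustrations.

```python
# Hypothetical second-stage learner: combine the GAN's context score with
# word2vec similarity and re-rank candidate distractors. Toy values only.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Each row: [gan_context_score, w2v_answer_similarity];
# labels: whether annotators judged the candidate distractor acceptable.
X = np.array([[0.9, 0.2], [0.1, 0.8], [0.7, 0.7], [0.2, 0.1]])
y = np.array([1, 1, 1, 0])
ranker = LogisticRegression().fit(X, y)

def rank_distractors(candidates, gan_scores, w2v_scores):
    """Order candidates by their combined acceptability probability."""
    feats = np.column_stack([gan_scores, w2v_scores])
    probs = ranker.predict_proba(feats)[:, 1]
    order = np.argsort(-probs)
    return [candidates[i] for i in order]

print(rank_distractors(["mitochondrion", "ribosome"], [0.8, 0.3], [0.4, 0.9]))
```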