Crowdsourcing Multiple Choice Science Questions

We present a novel method for obtaining high-quality, domain-targeted multiple choice questions from crowd workers. Generating such questions is difficult without sacrificing originality, relevance, or diversity in the answer options. Our method addresses these problems by leveraging a large corpus of domain-specific text and a small set of existing questions: it produces model suggestions for document selection and for answer distractor choice, which aid the human question-generation process. With this method we have assembled SciQ, a dataset of 13.7K multiple choice science exam questions (dataset available at this http URL). We demonstrate that the method produces in-domain questions by analyzing this new dataset and by showing that humans cannot distinguish the crowdsourced questions from original questions. When SciQ is used as additional training data alongside existing questions, we observe accuracy improvements on real science exams.
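The abstract does not include an implementation, but the distractor-suggestion step can be illustrated with a minimal sketch: rank candidate vocabulary words by embedding similarity to the correct answer (e.g., using pretrained vectors such as GloVe), so that suggested wrong answers are semantically close to, but distinct from, the true answer. Everything below is a hypothetical illustration, not the authors' code: the toy embedding table, the `suggest_distractors` function, and its parameters are all assumptions made for demonstration.

```python
import numpy as np

# Toy embedding table standing in for pretrained word vectors such as GloVe.
# In a real pipeline these would be loaded from a pretrained-embedding file
# built over the domain-specific corpus. (Hypothetical values.)
EMBEDDINGS = {
    "mitochondria": np.array([0.90, 0.10, 0.30]),
    "ribosome":     np.array([0.80, 0.20, 0.40]),
    "chloroplast":  np.array([0.85, 0.15, 0.35]),
    "nucleus":      np.array([0.70, 0.30, 0.50]),
    "volcano":      np.array([0.10, 0.90, 0.20]),
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def suggest_distractors(answer, k=3):
    """Rank candidate words by cosine similarity to the correct answer,
    excluding the answer itself, and return the top k as distractor
    suggestions for a human question writer to accept or reject."""
    query = EMBEDDINGS[answer]
    scored = [
        (word, cosine(query, vec))
        for word, vec in EMBEDDINGS.items()
        if word != answer
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [word for word, _ in scored[:k]]

if __name__ == "__main__":
    # Suggest plausible wrong answers for a cell-biology question.
    print(suggest_distractors("mitochondria"))
```

In the method described above, such model suggestions are only aids: crowd workers retain final control over which documents are used and which distractors appear in the finished question.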
