A framework for automatic question generation from text using deep reinforcement learning

Automatic question generation (QG) is a useful yet challenging task in NLP. Recent neural network-based approaches represent the state of the art in this task, but they have shortcomings. First, these models handle rare words poorly and tend to repeat words. Second, all previous works optimize the cross-entropy loss, which can introduce a mismatch between the training objective and the evaluation measure used at test time. In this paper, we present a novel deep reinforcement learning based framework for automatic question generation. The generator of the framework is a sequence-to-sequence model, enhanced with the copy mechanism to handle the rare-words problem and the coverage mechanism to solve the word repetition problem. The evaluator of the framework assigns a reward to each predicted question. The overall model is trained by learning the generator parameters that maximize this reward. Our framework allows us to directly optimize any task-specific score, including evaluation measures such as BLEU, GLEU, and ROUGE-L, that is suitable for sequence-to-sequence tasks such as QG. Our comprehensive evaluation shows that our approach significantly outperforms state-of-the-art systems on the widely used SQuAD benchmark in both automatic and human evaluation.
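To make the evaluator-and-reward idea concrete, the following is a minimal sketch and not the paper's implementation. It assumes ROUGE-L as the reward (one of the measures named above) and a plain REINFORCE-style loss with a baseline; the function names, the beta value, and the baseline term are illustrative assumptions, and the generator itself (with its copy and coverage mechanisms) is not modelled here.

```python
import math


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_reward(predicted, reference, beta=1.2):
    """ROUGE-L F-score of a predicted question against a reference question.

    Both arguments are lists of tokens. beta weights recall against
    precision; 1.2 is an assumed value chosen only for illustration.
    """
    lcs = lcs_length(predicted, reference)
    if lcs == 0:
        return 0.0
    precision = lcs / len(predicted)
    recall = lcs / len(reference)
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)


def policy_gradient_loss(token_log_probs, reward, baseline=0.0):
    """REINFORCE-style loss for one sampled question (an assumed formulation).

    token_log_probs holds the generator's log-probability of each sampled
    token. Minimizing this loss raises the probability of questions whose
    reward exceeds the baseline and lowers it otherwise.
    """
    return -(reward - baseline) * sum(token_log_probs)


# Hypothetical example: one sampled question scored against a reference.
reference = "what is the capital city of france".split()
sampled = "what is the capital of france".split()
reward = rouge_l_reward(sampled, reference)
loss = policy_gradient_loss([math.log(0.5), math.log(0.25)], reward, baseline=0.3)
```

In an actual training loop the log-probabilities would come from the generator network and the loss would be backpropagated through it; swapping `rouge_l_reward` for a BLEU or GLEU scorer changes only the reward function.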
