Generating Diverse and Consistent QA pairs from Contexts with Information-Maximizing Hierarchical Conditional VAEs

One of the most crucial challenges in question answering (QA) is the scarcity of labeled data, since it is costly to obtain question-answer pairs for a target text domain through human annotation. An alternative approach is to train on automatically generated QA pairs, produced either from the problem context or from large amounts of unstructured text (e.g., Wikipedia). In this work, we propose a hierarchical conditional variational autoencoder (HCVAE) that generates QA pairs from unstructured texts as contexts, while maximizing the mutual information between each generated question and its answer to ensure their consistency. We validate our Information-Maximizing Hierarchical Conditional Variational Autoencoder (Info-HCVAE) on several benchmark datasets against state-of-the-art baselines, evaluating the performance of a QA model (BERT-base) trained either on the generated QA pairs alone (QA-based evaluation) or on both the generated and human-labeled pairs (semi-supervised learning). The results show that our model obtains substantial performance gains over all baselines on both tasks while using only a fraction of the data for training.
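To make the objective concrete, below is a minimal PyTorch sketch of a loss of this general form: a conditional ELBO with two latents (reconstruction terms plus per-latent KL terms) combined with a mutual-information regularizer over the generated question and answer. Everything in the sketch is an illustrative assumption rather than the authors' released implementation: both latents are modeled as Gaussian for simplicity, the prior is taken as a standard normal rather than a learned conditional prior, and the MI term is estimated with a simple classifier-style lower bound that scores matched QA pairs against in-batch shuffled ones.

```python
# A minimal sketch of an Info-HCVAE-style training objective. All names,
# shapes, and the MI estimator are illustrative assumptions, not the
# authors' released implementation.
import torch
import torch.nn.functional as F

def info_hcvae_loss(recon_q_logits, q_targets,       # question reconstruction
                    recon_a_logits, a_targets,       # answer reconstruction
                    mu_q, logvar_q,                  # question latent posterior
                    mu_a, logvar_a,                  # answer latent posterior
                    q_emb, a_emb,                    # embeddings of generated Q/A
                    mi_weight=1.0):
    """ELBO with two latents plus an InfoMax regularizer (a sketch).

    The MI term is a simple binary-classifier lower bound: matched QA
    pairs should score high, in-batch shuffled pairs should score low.
    """
    # Reconstruction terms (cross-entropy over vocabulary / span positions).
    recon_q = F.cross_entropy(recon_q_logits, q_targets)
    recon_a = F.cross_entropy(recon_a_logits, a_targets)

    # KL terms against a standard normal prior, one per latent. A learned
    # conditional prior would replace this in a full conditional VAE.
    kl_q = -0.5 * torch.mean(1 + logvar_q - mu_q.pow(2) - logvar_q.exp())
    kl_a = -0.5 * torch.mean(1 + logvar_a - mu_a.pow(2) - logvar_a.exp())

    # InfoMax term: aligned pairs vs. shuffled (mismatched) in-batch pairs.
    pos = torch.sigmoid((q_emb * a_emb).sum(-1))                  # matched
    neg = torch.sigmoid((q_emb * a_emb.roll(1, dims=0)).sum(-1))  # shuffled
    mi_lb = torch.log(pos + 1e-8).mean() + torch.log(1 - neg + 1e-8).mean()

    # Minimizing this loss maximizes the ELBO and the MI lower bound.
    return recon_q + recon_a + kl_q + kl_a - mi_weight * mi_lb
```

In this formulation, raising `mi_weight` trades reconstruction fidelity for stronger question-answer agreement, which mirrors the consistency objective described above.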
