Unsupervised Adaptation of Question Answering Systems via Generative Self-training

BERT-era question answering (QA) systems have recently achieved impressive performance on several QA benchmarks. These systems are built on representations pre-trained with self-supervised objectives such as word masking and next-sentence prediction, using massive amounts of data. Nevertheless, additional pre-training closer to the end task, such as training on synthetic QA pairs, has been shown to improve performance further. While recent work has considered augmenting labelled data and leveraging large unlabelled datasets to generate synthetic QA data, directly adapting to the target data has received little attention. In this paper we investigate the iterative generation of synthetic QA pairs as a way to realize unsupervised self-adaptation. Motivated by the success of the roundtrip consistency method for filtering generated QA pairs, we present iterative generalizations of the approach, which maximize an approximation of a lower bound on the probability of the adaptation data. By adapting on synthetic QA pairs generated from the target data, our method improves QA systems significantly, using an order of magnitude less synthetic data and training computation than existing augmentation approaches.
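
For concreteness, the sketch below illustrates how roundtrip-consistency filtering might sit inside an iterative self-training loop over unlabelled target passages. It is a minimal sketch, not the paper's implementation: the interfaces (question_generator, qa_model, finetune) are hypothetical placeholders, and the lower-bound objective is abstracted into the repeated generate-filter-finetune cycle.

```python
# Minimal sketch of generative self-training with roundtrip-consistency
# filtering. All object interfaces here are assumed placeholders, not the
# paper's actual components.

def roundtrip_filter(passages, question_generator, qa_model):
    """Keep only synthetic QA pairs that the current QA model answers back."""
    synthetic = []
    for passage in passages:
        # Sample a candidate answer span and generate a question for it.
        answer = question_generator.sample_answer(passage)
        question = question_generator.generate(passage, answer)
        # Roundtrip check: the QA model must recover the same answer.
        predicted = qa_model.predict(passage, question)
        if predicted == answer:
            synthetic.append((passage, question, answer))
    return synthetic


def self_adapt(passages, question_generator, qa_model, finetune, rounds=3):
    """Iteratively adapt the QA model on filtered synthetic pairs drawn
    from the unlabelled target passages."""
    for _ in range(rounds):
        pairs = roundtrip_filter(passages, question_generator, qa_model)
        qa_model = finetune(qa_model, pairs)  # e.g. a few epochs on the pairs
    return qa_model
```

Because only pairs that survive the roundtrip check are used for fine-tuning, each round trains on synthetic data the current model already finds consistent, which is how this kind of loop can get by with far less synthetic data than bulk augmentation.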
