Multi-task Learning with Multi-head Attention for Multi-choice Reading Comprehension

Multiple-choice Machine Reading Comprehension (MRC) is an important and challenging Natural Language Understanding (NLU) task, in which a machine must choose the answer to a question from a set of choices, with the question placed in the context of text passages or dialog. In the last couple of years, the NLU field has been revolutionized by the advent of models based on the Transformer architecture, which are pretrained on massive amounts of unsupervised data and then fine-tuned for various supervised NLU tasks. Transformer models have come to dominate a wide variety of leaderboards in the NLU field; in the area of MRC, the current state-of-the-art model on the DREAM dataset (see [Sun et al., 2019]) fine-tunes ALBERT, a large pretrained Transformer-based model, and additionally combines it with an extra layer of multi-head attention between the context and the question-answer pair [Zhu et al., 2020]. The purpose of this note is to document a new state-of-the-art result on the DREAM task, accomplished by additionally performing multi-task learning on two multiple-choice MRC tasks (RACE and DREAM).
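The following is a minimal sketch (not the authors' code) of the architecture described above: a pretrained ALBERT encoder shared across the RACE and DREAM tasks, an extra multi-head attention layer that lets the question-answer tokens attend over the context tokens, and a scoring head over the answer options. Class and variable names (e.g. MultiTaskMRC, score_head) and hyperparameters such as the number of attention heads are illustrative assumptions, not the exact configuration reported in the paper.

```python
import torch
import torch.nn as nn
from transformers import AlbertModel


class MultiTaskMRC(nn.Module):
    """Shared ALBERT encoder + extra context/question-answer attention (sketch)."""

    def __init__(self, model_name="albert-xxlarge-v2", num_heads=16):
        super().__init__()
        # Encoder shared across both multiple-choice tasks (RACE and DREAM).
        self.encoder = AlbertModel.from_pretrained(model_name)
        hidden = self.encoder.config.hidden_size
        # Extra multi-head attention between context and question-answer tokens.
        self.cross_attn = nn.MultiheadAttention(hidden, num_heads, batch_first=True)
        # Scores one logit per answer option; shared here, could be task-specific.
        self.score_head = nn.Linear(hidden, 1)

    def forward(self, input_ids, attention_mask, token_type_ids):
        # input_ids: (batch, num_options, seq_len); each option is encoded as
        # "[CLS] context [SEP] question + candidate answer [SEP]".
        b, n, l = input_ids.shape
        flat = lambda t: t.reshape(b * n, l)
        hidden_states = self.encoder(
            input_ids=flat(input_ids),
            attention_mask=flat(attention_mask),
            token_type_ids=flat(token_type_ids),
        ).last_hidden_state
        # token_type_ids distinguish context (0) from question-answer (1) tokens;
        # restrict the attention keys to (non-padding) context positions.
        qa_mask = flat(token_type_ids).bool()
        ctx_mask = (~qa_mask) & flat(attention_mask).bool()
        attended, _ = self.cross_attn(
            query=hidden_states,
            key=hidden_states,
            value=hidden_states,
            key_padding_mask=~ctx_mask,  # True = ignore this key position
        )
        # Score each option from its attended [CLS] position; a cross-entropy
        # loss over these logits selects the correct option.
        logits = self.score_head(attended[:, 0]).view(b, n)
        return logits
```

In a multi-task training loop under this sketch, batches would be drawn alternately (or proportionally) from RACE and DREAM and passed through the same model, so that gradients from both datasets update the shared encoder and attention layer.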