Dual Head-wise Coattention Network for Machine Comprehension with Multiple-Choice Questions

Multiple-choice Machine Comprehension (MC) is an important and challenging natural language processing (NLP) task in which a machine must select the best answer from a set of candidates given a particular passage and question. Existing approaches either rely solely on powerful pre-trained language models or on overly complicated matching networks designed to capture the relationships among the passage, the question, and the candidate answers. In this paper, we present a novel architecture, the Dual Head-wise Coattention network (DHC), a simple and efficient attention-based neural network for multiple-choice MC. DHC not only supports a powerful pre-trained language model as its encoder, but also models the passage-question-answer relationship directly through an attention mechanism that matches and aggregates head-wise representations across multiple layers; this captures the relationship between question and passage more thoroughly and cooperates with large pre-trained language models more efficiently. To evaluate performance, we test the proposed model on five challenging and well-known multiple-choice MC datasets: RACE, DREAM, SemEval-2018 Task 11, OpenBookQA, and TOEFL. Extensive experiments show that our model achieves a significant increase in accuracy over existing models on all five datasets and consistently outperforms all tested baselines, including state-of-the-art techniques. More remarkably, DHC is a pluggable and flexible model that can be combined with any BERT-style pre-trained language model. Ablation studies further confirm its state-of-the-art performance and generalization.
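
The abstract describes the core of DHC as head-wise matching between passage and question/option representations, followed by aggregation and scoring. Below is a minimal PyTorch sketch of such a dual head-wise coattention block, assuming a BERT-style encoder has already produced token representations; the module name, hidden size, pooling choice, and scoring layer are illustrative assumptions, not the authors' exact implementation.

```python
# Sketch of a dual head-wise coattention matching block (assumed design,
# not the paper's released code). Inputs are encoder outputs for the
# passage and for the concatenated question+option sequence.
import torch
import torch.nn as nn
import torch.nn.functional as F


class HeadwiseCoattention(nn.Module):
    def __init__(self, hidden_size: int = 768, num_heads: int = 12):
        super().__init__()
        assert hidden_size % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = hidden_size // num_heads
        # Per-direction projections applied before head-wise matching.
        self.proj_p = nn.Linear(hidden_size, hidden_size)
        self.proj_q = nn.Linear(hidden_size, hidden_size)
        # Scores one candidate option from the aggregated matching features.
        self.scorer = nn.Linear(2 * hidden_size, 1)

    def _split_heads(self, x: torch.Tensor) -> torch.Tensor:
        # (batch, seq, hidden) -> (batch, heads, seq, head_dim)
        b, s, _ = x.shape
        return x.view(b, s, self.num_heads, self.head_dim).transpose(1, 2)

    def forward(self, passage: torch.Tensor, question_option: torch.Tensor) -> torch.Tensor:
        # passage:         (batch, len_p, hidden) encoder output
        # question_option: (batch, len_q, hidden) encoder output
        p = self._split_heads(self.proj_p(passage))
        q = self._split_heads(self.proj_q(question_option))

        # Head-wise similarity between every passage/question token pair.
        sim = torch.matmul(p, q.transpose(-1, -2)) / self.head_dim ** 0.5

        # Dual attention: passage attends to question and vice versa.
        p_to_q = torch.matmul(F.softmax(sim, dim=-1), q)                     # (b, h, len_p, d)
        q_to_p = torch.matmul(F.softmax(sim.transpose(-1, -2), dim=-1), p)   # (b, h, len_q, d)

        def merge_and_pool(x: torch.Tensor) -> torch.Tensor:
            # Merge heads back and aggregate over tokens by max-pooling.
            b, h, s, d = x.shape
            x = x.transpose(1, 2).reshape(b, s, h * d)
            return x.max(dim=1).values                                       # (b, hidden)

        features = torch.cat([merge_and_pool(p_to_q), merge_and_pool(q_to_p)], dim=-1)
        return self.scorer(features).squeeze(-1)                             # one logit per option


if __name__ == "__main__":
    # Toy usage: the 4 candidate options of one question are treated as a batch;
    # the option with the highest logit would be selected after a softmax.
    block = HeadwiseCoattention(hidden_size=768, num_heads=12)
    passage = torch.randn(4, 120, 768)
    question_option = torch.randn(4, 40, 768)
    print(block(passage, question_option).shape)  # torch.Size([4])
```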

[1] Omer Levy, et al. RoBERTa: A Robustly Optimized BERT Pretraining Approach, 2019, ArXiv.

[2] Hai Zhao, et al. Dual Co-Matching Network for Multi-choice Reading Comprehension, 2020, AAAI.

[3] Hui Wan. Multi-task Learning with Multi-head Attention for Multi-choice Reading Comprehension, 2020, ArXiv.

[4] Wentao Ma, et al. Convolutional Spatial Attention Model for Reading Comprehension with Multiple-Choice Questions, 2018, AAAI.

[5] Dilek Z. Hakkani-Tür, et al. MMM: Multi-stage Multi-task Learning for Multi-choice Reading Comprehension, 2020, AAAI.

[6] Alec Radford, et al. Improving Language Understanding by Generative Pre-Training, 2018.

[7] Jian Zhang, et al. SQuAD: 100,000+ Questions for Machine Comprehension of Text, 2016, EMNLP.

[8] Jun Zhao, et al. FinBERT: A Pre-trained Financial Language Representation Model for Financial Text Mining, 2020, IJCAI.

[9] Claire Cardie, et al. DREAM: A Challenge Data Set and Models for Dialogue-Based Reading Comprehension, 2019, TACL.

[10] Mohammad Shoeybi, et al. Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism, 2019, ArXiv.

[11] Phil Blunsom, et al. Teaching Machines to Read and Comprehend, 2015, NIPS.

[12] Shiyu Chang, et al. A Co-Matching Model for Multi-choice Reading Comprehension, 2018, ACL.

[13] Guokun Lai, et al. RACE: Large-scale ReAding Comprehension Dataset From Examinations, 2017, EMNLP.

[14] Heng Ji, et al. Improving Question Answering with External Knowledge, 2019, EMNLP.

[15] Ali Farhadi, et al. Bidirectional Attention Flow for Machine Comprehension, 2016, ICLR.

[16] Claire Cardie, et al. Improving Machine Reading Comprehension with General Reading Strategies, 2018, NAACL.

[17] Kevin Gimpel, et al. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations, 2019, ICLR.

[18] Mitesh M. Khapra, et al. ElimiNet: A Model for Eliminating Options for Reading Comprehension with Multiple Choice Questions, 2018, IJCAI.

[19] Simon Ostermann, et al. SemEval-2018 Task 11: Machine Comprehension Using Commonsense Knowledge, 2018, SemEval.

[20] Xiaodong Liu, et al. Towards Human-level Machine Reading Comprehension: Reasoning and Inference with Multiple Strategies, 2017, ArXiv.

[21] Thomas Wolf, et al. DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter, 2019, ArXiv.

[22] Min Tang, et al. Multi-Matching Network for Multiple Choice Reading Comprehension, 2019, AAAI.

[23] Alex Wang, et al. Looking for ELMo's friends: Sentence-Level Pretraining Beyond Language Modeling, 2018, ArXiv.

[24] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.

[25] Peng Li, et al. Option Comparison Network for Multiple-choice Reading Comprehension, 2019, ArXiv.

[26] Jing Zhang, et al. DIM Reader: Dual Interaction Model for Machine Comprehension, 2017, CCL.

[27] Yiming Yang, et al. XLNet: Generalized Autoregressive Pretraining for Language Understanding, 2019, NeurIPS.

[28] Yoshua Bengio, et al. Neural Machine Translation by Jointly Learning to Align and Translate, 2014, ICLR.

[29] Hai Zhao, et al. Dual Multi-head Co-attention for Multi-choice Reading Comprehension, 2020, ArXiv.

[30] Rui Yan, et al. Natural Language Inference by Tree-Based Convolution and Heuristic Matching, 2015, ACL.

[31] Vaishali Ingale, et al. GenNet: Reading Comprehension with Multiple Choice Questions using Generation and Selection model, 2020, ArXiv.

[32] Bo Jin, et al. Unified Generative Adversarial Networks for Multiple-Choice Oriented Machine Comprehension, 2020, ACM Trans. Intell. Syst. Technol.

[33] Peter Clark, et al. Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering, 2018, EMNLP.

[34] Furu Wei, et al. Hierarchical Attention Flow for Multiple-Choice Reading Comprehension, 2018, AAAI.

[35] Daniel Khashabi, et al. UnifiedQA: Crossing Format Boundaries With a Single QA System, 2020, EMNLP.

[36] Lin-Shan Lee, et al. Towards Machine Comprehension of Spoken Content: Initial TOEFL Listening Comprehension Test by Machine, 2016, INTERSPEECH.

[37] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.

[38] Jun Zhao, et al. Semantics-Reinforced Networks for Question Generation, 2020, ECAI.