MaP: A Matrix-based Prediction Approach to Improve Span Extraction in Machine Reading Comprehension

Span extraction is an essential problem in machine reading comprehension. Most existing algorithms predict the start and end positions of an answer span in the given context by generating two probability vectors. In this paper, we propose a novel approach that extends the probability vector to a probability matrix, which can cover more start-end position pairs. Specifically, for each possible start index, the method generates an end probability vector. In addition, we propose a sampling-based training strategy to address the computational cost and memory issues that arise when training on the full matrix. We evaluate our method on SQuAD 1.1 and three other question answering benchmarks. Using the competitive models BERT and BiDAF as backbones, our approach achieves consistent improvements on all datasets, demonstrating its effectiveness.
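Since the abstract describes the method only at a high level, the following is a minimal PyTorch sketch of the core idea rather than the paper's exact architecture: instead of two independent start/end probability vectors, a prediction head produces a start-by-end score matrix in which row i is an end-probability vector conditioned on start index i. The class name MatrixSpanHead and the bilinear scoring layer are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MatrixSpanHead(nn.Module):
    """Sketch of matrix-based span prediction: row i of the output
    is an end distribution conditioned on start index i. The scoring
    layer here (a bilinear product of projected states) is assumed,
    not taken from the paper."""

    def __init__(self, hidden_size):
        super().__init__()
        self.start_proj = nn.Linear(hidden_size, hidden_size)
        self.end_proj = nn.Linear(hidden_size, hidden_size)

    def forward(self, hidden_states):
        # hidden_states: (seq_len, hidden_size) encoder outputs,
        # e.g. from a BERT or BiDAF backbone.
        s = self.start_proj(hidden_states)      # (seq_len, hidden)
        e = self.end_proj(hidden_states)        # (seq_len, hidden)
        scores = s @ e.transpose(0, 1)          # (seq_len, seq_len)
        # Valid spans satisfy end >= start: mask the strictly lower
        # triangle (end < start) before normalizing.
        invalid = torch.tril(torch.ones_like(scores), diagonal=-1).bool()
        scores = scores.masked_fill(invalid, float('-inf'))
        # Row i: log-probabilities over end positions given start i.
        return F.log_softmax(scores, dim=-1)
```

At inference, the highest-scoring (start, end) pair can be read off this matrix together with a start distribution. At training time, the sampling-based strategy the abstract mentions would presumably compute the loss over only a sampled subset of start rows per step, rather than materializing the full seq_len x seq_len objective; the exact sampling scheme is not specified in this excerpt.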
