Recurrent Chunking Mechanisms for Long-Text Machine Reading Comprehension

In this paper, we study machine reading comprehension (MRC) on long texts, where a model takes as input a lengthy document and a question and then extracts a text span from the document as the answer. State-of-the-art models tend to use a pretrained transformer (e.g., BERT) to encode the joint contextual information of the document and question. However, these transformer-based models can only accept inputs of a fixed length (e.g., 512 tokens). To handle longer texts, previous approaches usually chunk them into equally-spaced segments and predict an answer from each segment independently, without considering information from other segments. As a result, they may form segments that fail to cover the correct answer span or that retain insufficient context around it, which significantly degrades performance. Moreover, they are less capable of answering questions that require cross-segment information. We propose to let a model learn to chunk in a more flexible way via reinforcement learning: the model decides which segment to process next, in either direction. We also employ recurrent mechanisms to let information flow across segments. Experiments on three MRC datasets -- CoQA, QuAC, and TriviaQA -- demonstrate the effectiveness of our proposed recurrent chunking mechanisms: the model obtains segments that are more likely to contain complete answers while providing sufficient context around the ground-truth answers for better predictions.
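To make the chunking setup concrete, below is a minimal, self-contained Python sketch (not the authors' implementation): fixed_stride_chunks reproduces the conventional equally-spaced chunking that the abstract describes, and learned_chunking illustrates, with a placeholder stub in place of a trained REINFORCE policy, how a model could instead choose the next segment's position in either direction while carrying a recurrent state across segments. All constants and helper names (MAX_LEN, STRIDE, ACTIONS, score_actions) are illustrative assumptions.

```python
import random

# Minimal sketch of chunking for a fixed-input-length reader.
# Names and values below are illustrative assumptions, not the paper's code.

MAX_LEN = 512   # maximum input length of a BERT-style encoder
STRIDE = 128    # fixed stride used by conventional chunking baselines


def fixed_stride_chunks(question_tokens, doc_tokens, max_len=MAX_LEN, stride=STRIDE):
    """Baseline: split the document into equally-spaced, independent segments.

    Each segment is paired with the question and processed on its own, so
    context outside a segment is invisible to the reader.
    """
    # Room left for document tokens after the question and 3 special tokens
    # ([CLS] question [SEP] segment [SEP]).
    doc_budget = max_len - len(question_tokens) - 3
    chunks, start = [], 0
    while start < len(doc_tokens):
        chunks.append((start, doc_tokens[start:start + doc_budget]))
        if start + doc_budget >= len(doc_tokens):
            break
        start += stride
    return chunks


# Hypothetical action set for learned chunking: the policy may move the
# segment window left or right by different strides.
ACTIONS = [-256, -128, 128, 256]


def score_actions(recurrent_state, segment):
    """Placeholder policy: one (unnormalized) score per action.

    In the paper's formulation this would be a neural policy trained with
    REINFORCE; here it is a random stub so the control flow is runnable.
    """
    return [random.random() for _ in ACTIONS]


def learned_chunking(question_tokens, doc_tokens, max_steps=4):
    """Sketch of recurrent chunking: a state is carried across segments and a
    policy decides where the next segment starts."""
    doc_budget = MAX_LEN - len(question_tokens) - 3
    start, recurrent_state, visited = 0, None, []
    for _ in range(max_steps):
        segment = doc_tokens[start:start + doc_budget]
        visited.append((start, segment))
        # recurrent_state would be updated here from the segment's encoding,
        # e.g., with an LSTM over per-segment representations.
        scores = score_actions(recurrent_state, segment)
        move = ACTIONS[scores.index(max(scores))]
        start = min(max(start + move, 0), max(len(doc_tokens) - doc_budget, 0))
    return visited


if __name__ == "__main__":
    question = "what did the author study".split()
    document = ["tok%d" % i for i in range(2000)]
    print("fixed-stride segment starts:", [s for s, _ in fixed_stride_chunks(question, document)])
    print("policy-chosen segment starts:", [s for s, _ in learned_chunking(question, document)])
```

The contrast the sketch is meant to show: the fixed-stride baseline always advances by the same amount, so an answer that straddles a segment boundary may never be fully covered by any single segment, whereas a policy that can move in either direction can reposition the window around the answer, which is the flexibility the proposed reinforcement-learning chunker is trained to exploit.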
