Meta Answering for Machine Reading

We investigate a framework for machine reading, inspired by real-world information-seeking problems, in which a meta question answering system interacts with a black-box environment. The environment encapsulates a competitive BERT-based machine reader that provides candidate answers to questions, possibly together with some context. To validate the realism of our formulation, we ask humans to play the role of a meta-answerer. With just a small snippet of text around an answer, humans can outperform the machine reader, improving recall. Similarly, a simple machine meta-answerer outperforms the environment, improving both precision and recall on the Natural Questions dataset. The system relies on jointly training answer scoring and the selection of conditioning information.
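
As a rough illustration of the setup described above, the sketch below shows a meta-answerer that queries a black-box QA environment for scored candidate answers plus short context snippets, rescores them, and may abstain to trade recall for precision. All names here (QAEnvironment, MetaAnswerer, Candidate, score_fn) are hypothetical and serve only to make the interaction protocol concrete; this is a minimal sketch under those assumptions, not the paper's actual implementation.

```python
# Minimal sketch of the meta-answering loop. All class and function names
# are hypothetical illustrations, not the paper's implementation.
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Candidate:
    answer: str          # candidate answer span from the machine reader
    snippet: str         # small snippet of text around the answer
    reader_score: float  # the black-box reader's confidence


class QAEnvironment:
    """Black box wrapping a machine reader (e.g. a BERT-based model).

    Given a question, it returns scored candidate answers together with
    short snippets of surrounding context.
    """

    def __init__(self, reader: Callable[[str], List[Candidate]]):
        self._reader = reader

    def query(self, question: str) -> List[Candidate]:
        return self._reader(question)


class MetaAnswerer:
    """Rescores the environment's candidates, conditioning on the snippet,
    and abstains (returns None) when no candidate looks trustworthy."""

    def __init__(self, score_fn: Callable[[str, Candidate], float],
                 threshold: float = 0.5):
        self._score_fn = score_fn   # a trained (question, candidate) scorer
        self._threshold = threshold

    def answer(self, question: str, env: QAEnvironment) -> Optional[str]:
        candidates = env.query(question)
        if not candidates:
            return None
        best = max(candidates, key=lambda c: self._score_fn(question, c))
        if self._score_fn(question, best) < self._threshold:
            return None  # abstain: trades recall for precision
        return best.answer


# Example usage with a stub reader and a trivial scorer:
# env = QAEnvironment(lambda q: [Candidate("Paris", "...capital of France...", 0.9)])
# meta = MetaAnswerer(lambda q, c: c.reader_score, threshold=0.5)
# print(meta.answer("What is the capital of France?", env))
```

The abstention threshold is one simple way a meta-answerer can improve precision over the raw environment; the scoring function would be where the jointly trained answer scoring and snippet selection enter.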
