Question Answering is a Format; When is it Useful?

Recent years have seen a dramatic expansion of tasks and datasets posed as question answering, from reading comprehension, semantic role labeling, and even machine translation, to image and video understanding. With this expansion, there are many differing views on the utility and definition of "question answering" itself. Some argue that its scope should be narrow, or broad, or that it is overused in datasets today. In this opinion piece, we argue that question answering should be considered a format which is sometimes useful for studying particular phenomena, not a phenomenon or task in itself. We discuss when a task is correctly described as question answering, and when a task is usefully posed as question answering, instead of using some other format.

[1]  Margaret Mitchell,et al.  VQA: Visual Question Answering , 2015, International Journal of Computer Vision.

[2]  Peter Clark,et al.  Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering , 2018, EMNLP.

[3]  Myle Ott,et al.  On The Evaluation of Machine Translation SystemsTrained With Back-Translation , 2019, ACL.

[4]  Ido Dagan,et al.  Crowdsourcing Question-Answer Meaning Representations , 2017, NAACL.

[5]  Rajarshi Das,et al.  Building Dynamic Knowledge Graphs from Text using Machine Reading Comprehension , 2018, ICLR.

[6]  Kyunghyun Cho,et al.  SearchQA: A New Q&A Dataset Augmented with Context from a Search Engine , 2017, ArXiv.

[7]  Hui Wang,et al.  Neural Machine Reading Comprehension: Methods and Trends , 2019 .

[8]  Raymond J. Mooney,et al.  Learning to Parse Database Queries Using Inductive Logic Programming , 1996, AAAI/IAAI, Vol. 2.

[9]  Percy Liang,et al.  Adversarial Examples for Evaluating Reading Comprehension Systems , 2017, EMNLP.

[10]  Alexander I. Rudnicky,et al.  Expanding the Scope of the ATIS Task: The ATIS-3 Corpus , 1994, HLT.

[11]  Eunsol Choi,et al.  TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension , 2017, ACL.

[12]  Richard Socher,et al.  The Natural Language Decathlon: Multitask Learning as Question Answering , 2018, ArXiv.

[13]  Ming-Wei Chang,et al.  Latent Retrieval for Weakly Supervised Open Domain Question Answering , 2019, ACL.

[14]  Alvin Cheung,et al.  Learning a Neural Semantic Parser from User Feedback , 2017, ACL.

[15]  Gabriel Stanovsky,et al.  DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs , 2019, NAACL.

[16]  Ellen M. Voorhees,et al.  The TREC-8 Question Answering Track Report , 1999, TREC.

[17]  Luke S. Zettlemoyer,et al.  Large-Scale QA-SRL Parsing , 2018, ACL.

[18]  Yejin Choi,et al.  MathQA: Towards Interpretable Math Word Problem Solving with Operation-Based Formalisms , 2019, NAACL.

[19]  Christopher D. Manning,et al.  GQA: A New Dataset for Real-World Visual Reasoning and Compositional Question Answering , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[20]  Richard Socher,et al.  Ask Me Anything: Dynamic Memory Networks for Natural Language Processing , 2015, ICML.

[21]  David Berthelot,et al.  WikiReading: A Novel Large-scale Language Understanding Task over Wikipedia , 2016, ACL.

[22]  Lei Yu,et al.  Learning and Evaluating General Linguistic Intelligence , 2019, ArXiv.

[23]  Andrew Chou,et al.  Semantic Parsing on Freebase from Question-Answer Pairs , 2013, EMNLP.

[24]  Ming-Wei Chang,et al.  Natural Questions: A Benchmark for Question Answering Research , 2019, TACL.

[25]  Ming-Wei Chang,et al.  BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions , 2019, NAACL.

[26]  Oren Etzioni,et al.  Scaling question answering to the Web , 2001, WWW '01.

[27]  Jian Zhang,et al.  SQuAD: 100,000+ Questions for Machine Comprehension of Text , 2016, EMNLP.

[28]  Li Fei-Fei,et al.  CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[29]  Sameer Singh,et al.  Compositional Questions Do Not Necessitate Multi-hop Reasoning , 2019, ACL.

[30]  An Yang,et al.  Machine Reading Comprehension: a Literature Review , 2019, ArXiv.

[31]  Rachel Rudinger,et al.  Collecting Diverse Natural Language Inference Problems for Sentence Representation Evaluation , 2018, BlackboxNLP@EMNLP.

[32]  Luke S. Zettlemoyer,et al.  Question-Answer Driven Semantic Role Labeling: Using Natural Language to Annotate Natural Language , 2015, EMNLP.

[33]  Mingxin Zhou,et al.  Entity-Relation Extraction as Multi-Turn Question Answering , 2019, ACL.

[34]  Omer Levy,et al.  Zero-Shot Relation Extraction via Reading Comprehension , 2017, CoNLL.

[35]  Jonathan Berant,et al.  CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge , 2019, NAACL.

[36]  Ali Farhadi,et al.  From Recognition to Cognition: Visual Commonsense Reasoning , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[37]  Yejin Choi,et al.  Social IQA: Commonsense Reasoning about Social Interactions , 2019, EMNLP 2019.