HEAD-QA: A Healthcare Dataset for Complex Reasoning

We present HEAD-QA, a multi-choice question answering testbed to encourage research on complex reasoning. The questions come from exams to access a specialized position in the Spanish healthcare system, and are challenging even for highly specialized humans. We then consider monolingual (Spanish) and cross-lingual (to English) experiments with information retrieval and neural techniques. We show that: (i) HEAD-QA challenges current methods, and (ii) the results lag well behind human performance, demonstrating its usefulness as a benchmark for future work.

David Vilares | Carlos Gómez-Rodríguez
