Inferring Which Medical Treatments Work from Reports of Clinical Trials

How do we know if a particular medical treatment actually works? Ideally one would consult all available evidence from relevant clinical trials. Unfortunately, such results are primarily disseminated in natural language scientific articles, imposing a substantial burden on those trying to make sense of them. In this paper, we present a new task and corpus for making this unstructured evidence actionable. The task entails inferring reported findings from a full-text article describing a randomized controlled trial (RCT) with respect to a given intervention, comparator, and outcome of interest, e.g., inferring whether an article provides evidence supporting the use of aspirin to reduce risk of stroke, as compared to placebo. We present a new corpus for this task comprising 10,000+ prompts coupled with full-text articles describing RCTs. Results using a suite of models, ranging from heuristic (rule-based) approaches to attentive neural architectures, demonstrate the difficulty of the task, which we believe is largely due to the lengthy, technical input texts. To facilitate further work on this important, challenging problem, we make the corpus, documentation, a website and leaderboard, and code for baselines and evaluation available at this http URL.
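
To make the shape of the task concrete, the following is a minimal sketch of one prompt and a toy rule-based guesser. It is an illustration only, not the corpus schema or any of the baselines: the `Prompt` class, the three label names, the cue words, and `heuristic_infer` are all assumptions introduced here.

```python
from dataclasses import dataclass

# Hypothetical label set; the abstract does not enumerate the labels.
LABELS = ("increased", "decreased", "no_difference")

# Hypothetical cue words (substring matches) for each directional label.
CUES = {
    "increased": ("increase", "higher", "greater"),
    "decreased": ("decrease", "reduc", "lower"),
}


@dataclass(frozen=True)
class Prompt:
    """One query against an article: intervention vs. comparator on an outcome."""
    intervention: str
    comparator: str
    outcome: str


def heuristic_infer(article_text: str, prompt: Prompt) -> str:
    """Toy rule: count cue words in sentences that mention the outcome."""
    scores = {label: 0 for label in CUES}
    outcome = prompt.outcome.lower()
    # Naive sentence split on periods; real articles need a proper splitter.
    for sentence in article_text.lower().split("."):
        if outcome not in sentence:
            continue
        for label, words in CUES.items():
            scores[label] += sum(sentence.count(word) for word in words)
    # No cues, or a tie between directions, falls back to "no_difference".
    if scores["increased"] == scores["decreased"]:
        return "no_difference"
    return max(scores, key=scores.get)


example_prompt = Prompt("aspirin", "placebo", "risk of stroke")
example_text = (
    "Aspirin reduced the risk of stroke compared with placebo. "
    "Adverse events were similar between groups."
)
prediction = heuristic_infer(example_text, example_prompt)
```

A rule this shallow ignores the intervention and comparator entirely and reads only cue words near the outcome, which hints at why full-text inference over lengthy, technical articles is hard.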
