Detecting Contradictory COVID-19 Drug Efficacy Claims from Biomedical Literature

The COVID-19 pandemic created a deluge of questionable and contradictory scientific claims about drug efficacy, an "infodemic" with lasting consequences for science and society. In this work, we argue that NLP models can help domain experts distill and understand the literature in this complex, high-stakes area. Our task is to automatically identify contradictory claims about COVID-19 drug efficacy. We frame this as a natural language inference (NLI) problem and offer a new NLI dataset created by domain experts. The NLI framing allows us to build training curricula that combine existing NLI datasets with our own. The resulting models are useful investigative tools. We provide a case study of how these models help a domain expert summarize and assess evidence concerning remdesivir and hydroxychloroquine.
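To make the NLI framing concrete, the following is a minimal sketch, not the implementation used in this work. It assumes the common three-way NLI label set and a curriculum that simply presents datasets in a fixed order; the names `ClaimPair` and `curriculum`, and the example sentences (with the placeholder "Drug X"), are hypothetical and do not come from the dataset.

```python
from dataclasses import dataclass
from typing import Iterator, List, Sequence

# Assumed label set: the usual three-way NLI scheme.
LABELS = ("entailment", "contradiction", "neutral")


@dataclass(frozen=True)
class ClaimPair:
    """One NLI example: two claims and the relation between them."""
    premise: str
    hypothesis: str
    label: str

    def __post_init__(self) -> None:
        if self.label not in LABELS:
            raise ValueError(f"unknown NLI label: {self.label!r}")


def curriculum(stages: Sequence[List[ClaimPair]]) -> Iterator[ClaimPair]:
    """Yield examples stage by stage, preserving the order of the stages."""
    for stage in stages:
        yield from stage


# Made-up placeholder examples, for illustration only.
general_nli = [
    ClaimPair("A man is sleeping.", "A man is awake.", "contradiction"),
]
drug_claims = [
    ClaimPair(
        "Drug X reduced mortality in hospitalized patients.",
        "Drug X did not reduce mortality in hospitalized patients.",
        "contradiction",
    ),
]

# One possible ordering: general-domain pairs first, in-domain pairs last.
ordered = list(curriculum([general_nli, drug_claims]))
```

A fine-tuning loop would consume `ordered` one stage after another; which datasets to include and in what order is the design choice that a curriculum encodes.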
