Methods and Tools to Enhance Rigor and Reproducibility of Biomedical Research

The rigor and reproducibility of biomedical research have been the topic of much debate in recent years. Replication failures, an increasing number of retractions, and the pervasiveness of questionable research practices have eroded confidence in published findings and suggest that a portion of the biomedical research investment is wasted. Some ongoing efforts aim to address problems in research conduct and dissemination, focusing mainly on standardization: guidelines and principles pertaining to data, code, and publications have been proposed. The goal of this didactic panel is to engage the medical informatics community in a discussion of strategies that complement such efforts using informatics methods, tools, and resources. First, we will provide a brief overview of standardization initiatives. Next, the panelists will present their informatics-based approaches to improving the rigor and reproducibility of biomedical research, focusing on areas such as information retrieval, natural language processing/text mining, and semantic modeling. Finally, with audience participation, we will discuss the challenges facing informatics research that aims to address these problems and seek to identify potentially fruitful research directions.
