A question of trust: can we build an evidence base to gain trust in systematic review automation technologies?

Background
Although many aspects of systematic reviews already rely on computational tools, systematic reviewers have been reluctant to adopt machine learning tools.

Discussion
The reasons for the slow adoption of machine learning tools into systematic reviews are likely multifactorial. We focus on the current absence of trust in automation and on set-up challenges as major barriers to adoption. It is important that reviews produced using automation tools are considered non-inferior or superior to current practice. However, this standard alone will likely not be sufficient to lead to widespread adoption. As with many technologies, it is important that reviewers see "others" in the review community using automation tools. Adoption will also be slow if the automation tools are not compatible with the workflows and tasks currently used to produce reviews. Many automation tools being developed for systematic reviews are framed as classification problems (for example, deciding whether a citation should be included). The evidence that these tools are non-inferior or superior can therefore be presented using methods similar to diagnostic test evaluations, e.g., precision and recall measured against a human reviewer. However, the assessment of automation tools presents unique challenges for investigators and systematic reviewers, including the need to clarify which metrics matter to the systematic review community and the documentation challenges of reproducible software experiments.

Conclusion
We discuss these adoption barriers with the goal of giving tool developers guidance on how to design and report such evaluations, and giving end users a basis for assessing their validity. Further, we discuss approaches to formatting and announcing publicly available datasets suitable for assessing automation technologies and tools. Making these resources available will increase trust that tools are non-inferior or superior to current practice. Finally, we note that, even with evidence that automation tools are non-inferior or superior to current practice, substantial set-up challenges remain for mainstream integration of automation into the systematic review process.
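To illustrate the evaluation framing described above, the following is a minimal sketch of how a screening-automation tool could be scored like a diagnostic test against a human reviewer. The function name and the include/exclude labels are hypothetical and purely illustrative; they are not taken from the paper or from any specific tool.

```python
# Minimal sketch: precision and recall of an automation tool's screening
# decisions, with the human reviewer's decisions as the reference standard.
# Labels: 1 = include the citation, 0 = exclude it (hypothetical data).

def precision_recall(human_labels, tool_labels):
    """Return (precision, recall) of the tool relative to the human reviewer."""
    tp = sum(1 for h, t in zip(human_labels, tool_labels) if h == 1 and t == 1)
    fp = sum(1 for h, t in zip(human_labels, tool_labels) if h == 0 and t == 1)
    fn = sum(1 for h, t in zip(human_labels, tool_labels) if h == 1 and t == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Hypothetical screening decisions for eight citations.
human = [1, 0, 1, 1, 0, 0, 1, 0]
tool  = [1, 0, 1, 0, 0, 1, 1, 0]
p, r = precision_recall(human, tool)
print(f"precision={p:.2f} recall={r:.2f}")
```

In citation screening, recall (the proportion of truly relevant studies the tool retains) is usually weighted more heavily than precision, since missed studies cannot be recovered later in the review; which metrics the review community actually prioritises is exactly the open question raised in the Discussion.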
