A decision support system for automating document retrieval and citation screening

Abstract The systematic literature review (SLR) process includes several steps to collect secondary data and analyze it to answer research questions. In this context, the document retrieval and primary study selection steps are heavily intertwined and are known for their repetitiveness, high human workload, and the difficulty of identifying all relevant literature. This study aims to reduce the human workload and error of the document retrieval and primary study selection processes using a decision support system (DSS). An open-source DSS is proposed that supports the document retrieval step, dataset preprocessing, and citation classification. The DSS is domain-independent, as it has been shown to assess an article’s relevance based solely on the title and abstract, features that can be consistently retrieved from scientific database APIs. Additionally, the DSS is designed to run in the cloud and requires no programming knowledge from reviewers. A Multi-Channel CNN architecture is implemented to support the citation screening process. With the provided DSS, reviewers fill in their search strategy and manually label only a subset of the citations. The remaining unlabeled citations are automatically classified and sorted by predicted probability. For four out of five review datasets, use of the DSS achieved significant workload savings of at least 10%. The cross-validation results show that the system provides consistent results, with up to 88.3% of work saved during citation screening. In two cases, our model performed better than the benchmark on the review datasets. As such, the proposed approach can assist the development of systematic literature reviews independent of the domain. The proposed DSS is effective and can substantially decrease the workload and error rate of the document retrieval and citation screening steps.
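The ranking and work-saved ideas in the abstract can be illustrated with a minimal sketch. The abstract does not state how work saved is computed, so the sketch assumes the Work Saved over Sampling (WSS) measure that is common in citation screening studies: the fraction of citations a reviewer can skip once a target recall is reached, minus the fraction that random sampling would save at the same recall. The function names, the toy citation IDs, and the probabilities are hypothetical and are not taken from the DSS itself.

```python
import math
from typing import List, Sequence, Tuple


def rank_by_probability(
    ids: Sequence[str], probs: Sequence[float]
) -> List[Tuple[str, float]]:
    """Sort citations so the most probably relevant ones come first."""
    return sorted(zip(ids, probs), key=lambda pair: pair[1], reverse=True)


def work_saved_over_sampling(
    ranked_labels: Sequence[int], recall: float = 0.95
) -> float:
    """WSS at a target recall (assumed metric, not confirmed by the abstract).

    ranked_labels holds the true labels (1 = relevant, 0 = irrelevant)
    in the order produced by the classifier's ranking.
    """
    n = len(ranked_labels)
    total_relevant = sum(ranked_labels)
    if n == 0 or total_relevant == 0:
        raise ValueError("need at least one citation and one relevant label")

    # Number of relevant citations that must be found to hit the target recall.
    # The small epsilon guards against floating-point overshoot before ceil.
    needed = math.ceil(recall * total_relevant - 1e-9)

    found = 0
    for k, label in enumerate(ranked_labels, start=1):
        found += label
        if found >= needed:
            # (n - k) citations below the cutoff never need manual screening.
            return (n - k) / n - (1.0 - recall)
    raise RuntimeError("target recall not reachable")  # defensive, not expected


# Toy usage with hypothetical predicted probabilities and true labels.
ids = ["c1", "c2", "c3", "c4", "c5"]
probs = [0.1, 0.9, 0.4, 0.8, 0.2]
truth = {"c1": 0, "c2": 1, "c3": 0, "c4": 1, "c5": 0}

ranked = rank_by_probability(ids, probs)
ranked_labels = [truth[cid] for cid, _ in ranked]
print(work_saved_over_sampling(ranked_labels, recall=1.0))
```

In this toy case both relevant citations are ranked first, so a reviewer reading from the top could stop after two of the five citations. Real screening results depend on the classifier, the dataset, and the recall level chosen.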
