A Multi-Channel Convolutional Neural Network approach to automate the citation screening process

Abstract The systematic literature review (SLR) process is divided into several steps to increase rigor and reproducibility. One important step is the selection of primary studies (i.e., citation screening), which aims to identify the relevant primary studies fairly and rigorously. Using the study selection criteria, reviewers determine whether an article should be included in or excluded from the SLR. However, the screening process is highly time-consuming and error-prone, as the researchers must read each title and possibly hundreds to thousands of abstracts and full-text documents. This study aims to automate the citation screening process using deep learning algorithms, with the goal of reducing the time and cost of screening and increasing the precision and recall with which relevant primary studies are identified. We propose a Multi-Channel Convolutional Neural Network (CNN) that automatically classifies a given set of citations. Because the architecture uses the title and abstract as features, our end-to-end pipeline is domain-independent. We performed six experiments to assess the performance of Multi-Channel CNNs across 20 publicly available systematic literature review datasets. For 18 out of 20 review datasets, the proposed method achieved significant workload savings of at least 10%, and in several cases our model yielded statistically significantly better performance over two benchmark review datasets. We conclude that Multi-Channel CNNs are effective for the citation screening process in SLRs. They perform best on large datasets of more than 2500 samples with few missing abstracts.
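The abstract reports workload savings without stating how they are measured. As an illustration only, the sketch below assumes the Work Saved over Sampling (WSS) measure that is commonly used in citation screening studies: citations are ranked by classifier score, reviewers screen from the top until a target recall is reached, and WSS is the fraction of citations left unscreened minus (1 - target recall). The function name, the 95% recall target, and the toy scores and labels are assumptions made for this example, not values from the study.

```python
import math


def work_saved_over_sampling(scores, labels, target_recall=0.95):
    """Hypothetical helper: WSS at a target recall for a score-ranked list.

    scores: classifier scores, higher means more likely relevant.
    labels: 1 for a relevant citation, 0 for an irrelevant one.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    total = len(labels)
    positives = sum(labels)
    if total == 0 or positives == 0:
        raise ValueError("need at least one citation and one relevant citation")

    # Rank citations from most to least likely relevant.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    # Number of relevant citations that must be found to reach the target recall.
    needed = math.ceil(target_recall * positives)

    found = 0
    screened = 0
    for screened, (_, label) in enumerate(ranked, start=1):
        found += label
        if found >= needed:
            break

    # Citations below the cutoff are never read (true plus false negatives).
    unscreened = total - screened
    return unscreened / total - (1.0 - target_recall)


# Toy data (assumed): 20 citations, 4 relevant, ranked at positions 1, 2, 3, 6.
toy_labels = [1, 1, 1, 0, 0, 1] + [0] * 14
toy_scores = [1.0 - 0.05 * i for i in range(20)]
toy_wss = work_saved_over_sampling(toy_scores, toy_labels)
print(f"WSS@95 on toy data: {toy_wss:.2f}")
```

Under this definition, a workload saving of 10% would mean that reviewers read 10% fewer citations than random-order screening would require to reach the same recall.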
