Automating Document Discovery in the Systematic Review Process: How to Use Chaff to Extract Wheat

Systematic reviews in fields such as empirical medicine address research questions by comprehensively examining the entire published literature. Conventionally, manual literature surveys decide inclusion in two steps: first on the basis of title and abstract, then on the full text. Current methods for automating the process, however, make no distinction between gold data drawn from these two stages. In this work we compare how different schemes for choosing positive and negative examples from the two screening stages affect the training of automated systems. We train a ranker using logistic regression and evaluate it on a new gold standard dataset for clinical NLP and on an existing gold standard dataset for drug class efficacy. Classification and ranking achieve average AUCs of 0.803 and 0.768 on the two datasets when relying on gold standard decisions based on titles and abstracts, and AUCs of 0.625 and 0.839 when relying on gold standard decisions based on full text. Our results suggest that it makes little difference which screening stage the gold standard decisions are drawn from, and that the decisions need not be based on the full text. The results further suggest that common off-the-shelf algorithms can reduce the amount of work required to retrieve relevant literature.
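
As a rough illustration of the approach described above, the sketch below trains a logistic regression ranker on screening decisions and reports AUC. It assumes scikit-learn; the TF-IDF feature representation, toy data, variable names, and hyperparameters are illustrative assumptions rather than the authors' exact pipeline.

```python
# Minimal sketch (assumed, not the authors' exact pipeline): rank candidate
# articles for systematic-review screening with logistic regression over
# TF-IDF features of concatenated titles and abstracts, evaluated by AUC.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Hypothetical toy data: one string per candidate article (title + abstract)
# and a binary screening decision (1 = include, 0 = exclude), taken from
# either the title/abstract stage or the full-text stage.
texts = [
    "Effect of drug X on blood pressure: a randomised controlled trial ...",
    "Comparative efficacy of statins for lowering LDL cholesterol ...",
    "Clinician attitudes toward guideline adoption: a qualitative study ...",
    "Hospital staffing models and administrative workload: a survey ...",
]
labels = [1, 1, 0, 0]

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.5, stratify=labels, random_state=0
)

# Simple bag-of-words features; the paper's actual feature set may differ.
vectorizer = TfidfVectorizer(ngram_range=(1, 2))
X_train_vec = vectorizer.fit_transform(X_train)
X_test_vec = vectorizer.transform(X_test)

clf = LogisticRegression(max_iter=1000, class_weight="balanced")
clf.fit(X_train_vec, y_train)

# Score unscreened articles by predicted inclusion probability, rank them,
# and report the area under the ROC curve.
scores = clf.predict_proba(X_test_vec)[:, 1]
print("AUC:", roc_auc_score(y_test, scores))
for score, text in sorted(zip(scores, X_test), reverse=True):
    print(f"{score:.3f}  {text[:60]}")
```

In practice the ranked list would be handed to reviewers, who screen articles from the top down; the AUC then reflects how well relevant articles are concentrated near the top of the ranking.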
