A Text-Mining Framework for Supporting Systematic Reviews.

Systematic reviews (SRs) involve the identification, appraisal, and synthesis of all relevant studies for focused questions in a structured, reproducible manner. High-quality SRs follow strict procedures and require significant resources and time. We investigated advanced text-mining approaches to reduce the burden of abstract screening in SRs and to provide a high-level summary of the retrieved information. We proposed a text-mining framework for supporting SRs built on three self-defined, semantics-based ranking metrics: keyword relevance, indexed-term relevance, and topic relevance. Keyword relevance is based on the user-defined keyword list used in the search strategy. Indexed-term relevance is derived from the indexed vocabulary that domain experts develop for indexing journal articles and books. Topic relevance is defined as the semantic similarity among retrieved abstracts in terms of topics generated by latent Dirichlet allocation, a Bayesian model for discovering topics. We tested the proposed framework on three published SRs addressing a variety of topics (Mass Media Interventions, Rectal Cancer, and Influenza Vaccine). Recall was as high as 100% when 91.8%, 85.7%, and 49.3% of the abstract screening labor was saved for the three cases, respectively. Topic analysis showed strong topic similarity among the relevant studies identified manually, which supports including topic analysis as a relevance metric. These results demonstrate that advanced text-mining approaches can significantly reduce the abstract screening labor of SRs and provide an informative summary of relevant studies.
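The evaluation idea in the abstract (rank abstracts by relevance, screen from the top, and measure how much screening is avoided at a given recall) can be illustrated with a minimal sketch. The scoring rule here, the fraction of search keywords found in an abstract, is an assumption made for illustration and is not the framework's actual keyword relevance metric; the function names and the toy abstracts are hypothetical as well.

```python
def keyword_relevance(abstract, keywords):
    """Illustrative score (an assumption): fraction of keywords present in the abstract."""
    if not keywords:
        return 0.0
    text = abstract.lower()
    return sum(kw.lower() in text for kw in keywords) / len(keywords)


def work_saved_at_recall(scores, labels, target_recall=1.0):
    """Screen abstracts in descending score order until the target recall is met.

    Returns (fraction of abstracts left unscreened, recall reached).
    `labels` holds 1 for a relevant abstract and 0 otherwise.
    """
    total_relevant = sum(labels)
    if total_relevant == 0:
        raise ValueError("at least one relevant abstract is required")
    # Highest-scoring abstracts are screened first.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    found = 0
    for screened, i in enumerate(order, start=1):
        found += labels[i]
        if found / total_relevant >= target_recall:
            return 1 - screened / len(scores), found / total_relevant
    return 0.0, found / total_relevant


# Toy data, invented for this example only.
keywords = ["influenza", "vaccine"]
abstracts = [
    "Influenza vaccine efficacy in healthy children",
    "Surgical outcomes in rectal cancer",
    "Vaccine storage logistics",
    "Live attenuated influenza vaccine trial",
    "Dietary habits of adolescents",
]
labels = [1, 0, 0, 1, 0]  # manual relevance judgments

scores = [keyword_relevance(a, keywords) for a in abstracts]
saved, recall = work_saved_at_recall(scores, labels, target_recall=1.0)
print(f"work saved: {saved:.1%}, recall: {recall:.1%}")
```

A fuller version would combine this score with indexed-term and topic relevance before ranking; how the framework weights the three metrics is not stated in the abstract, so it is left out here.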
