Using ontology-based semantic similarity to facilitate the article screening process for systematic reviews

OBJECTIVE Systematic reviews (SRs) summarize evidence from high-quality studies and are considered the preferred source of evidence for evidence-based practice (EBP). However, conducting SRs is time- and labor-intensive because of the high cost of article screening. In previous studies, we demonstrated that established (lexical) article relationships can help identify relevant articles efficiently and effectively. Here we propose to enhance article relationships with background semantic knowledge derived from Unified Medical Language System (UMLS) concepts and ontologies.

METHODS We developed a pipelined semantic concept representation process that represents the articles of an SR in an optimized and enriched semantic space of UMLS concepts. Throughout the process, we leveraged concepts and concept relations encoded in biomedical ontologies (SNOMED-CT and MeSH) within the UMLS framework to enrich the concept features of each article. Article relationships (similarities) were established and represented as a semantic article network, which was then used to assist article screening. We incorporated active learning to simulate an interactive article recommendation process and evaluated performance on 15 completed SRs, using work saved over sampling at 95% recall (WSS95) as the performance measure.

RESULTS We compared the WSS95 of our ontology-based semantic approach with that of existing lexical feature approaches and corpus-based semantic approaches, and ours achieved a better WSS95 in most SRs. It also achieved the highest average WSS95 (43.81%) and the highest total WSS95 (657.18%, summed over the 15 SRs).

CONCLUSION We demonstrated that ontology-based semantics can facilitate the identification of relevant articles for SRs. Effective concepts and concept relations derived from UMLS ontologies can be used to establish semantic relationships between articles. Our approach showed promising performance and can readily be applied to any SR topic in the biomedical domain.
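The abstract reports WSS95 without defining it. The sketch below assumes the commonly used definition, WSS@R = (N - n_screened) / N - (1 - R), where N is the number of candidate articles and n_screened is the number that must be screened, in ranked order, to reach recall R. The function name `wss_at_recall` and the toy label lists are illustrative assumptions, not taken from the paper.

```python
import math


def wss_at_recall(ranked_labels, recall=0.95):
    """Work saved over sampling (WSS) at a target recall level.

    ranked_labels: 0/1 relevance labels in the order the articles
    would be screened (1 = relevant, 0 = irrelevant).
    """
    n = len(ranked_labels)
    total_relevant = sum(ranked_labels)
    if n == 0 or total_relevant == 0:
        raise ValueError("need at least one article and one relevant article")

    # Smallest number of relevant articles that reaches the target recall.
    needed = math.ceil(recall * total_relevant)

    found = 0
    screened = 0
    for label in ranked_labels:
        screened += 1
        found += label
        if found >= needed:
            break

    # Fraction of articles left unscreened, minus the saving that
    # screening in random order would be expected to give.
    return (n - screened) / n - (1.0 - recall)


# Toy example: both relevant articles are ranked first among ten.
good_ranking = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
# Toy example: both relevant articles are ranked last.
bad_ranking = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

print(round(wss_at_recall(good_ranking), 4))
print(round(wss_at_recall(bad_ranking), 4))
```

Under this definition a higher value means fewer articles had to be screened manually. The reported total of 657.18% is the sum of the per-review values, which is consistent with the reported average: 657.18 / 15 ≈ 43.81.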
