Combining open‐source natural language processing tools to parse clinical practice guidelines

Natural language processing (NLP) has been used to process text from patient records and narratives. However, most of the methods were developed for specific systems, so new research is needed to assess whether they can be easily retargeted to new applications and goals with the same performance. In this paper, open-source tools are reused as building blocks for a new system. The aim of our work is to evaluate the applicability of current NLP technology to a new domain: the automatic acquisition of knowledge about diagnostic and therapeutic procedures from free-text clinical practice guideline documents. To this end, two publicly available syntactic parsers, several terminology resources, and a tool for identifying semantic predications were tailored to increase the performance of each tool individually. We apply this approach to 171 sentences selected by experts from a clinical guideline and compare the results with those of the same tools applied without tailoring. The results show that, with some adaptation, open-source NLP tools can be retargeted to new tasks with an accuracy equivalent to that of methods designed for specific tasks.
