Semantic Knowledge Base Construction from Radiology Reports

The volume of data stored daily in healthcare institutions calls for new methods to summarize and reuse that information in clinical practice. To take full advantage of modern healthcare information systems, these methods must address challenges such as the extraction of relevant information, data redundancy, and the lack of associations within the data. This article proposes a pipeline that addresses these challenges for medical imaging reports by automatically extracting and linking information and by summarizing natural language reports into an ontology model. Using data from the PhysioNet MIMIC II database, we created a semantic knowledge base with more than 6.5 million triples obtained from a collection of 16,000 radiology reports.
