Patient Identification for Clinical Trials with Ontology-based Information Extraction from Documents

In this paper, we describe the use of ontologies in a system for recruiting patients for clinical trials, which is currently being tested at the Charité – Universitätsmedizin Berlin, one of the largest university hospitals in Europe. The main purpose of this system, the Clinical Research Data Warehouse (CRDW), is to support patient recruitment for clinical trials based on routine data from the hospital’s clinical information system (CIS). In contrast to most other systems for similar purposes, the CRDW also makes use of information contained in clinical documents such as admission reports, radiological findings, and discharge letters. The linguistic analysis of these documents recognizes negated and coordinated phrases. It is supported by clinical domain ontologies that enable the identification of main terms and their properties, as well as semantic search with synonyms, hypernyms, and syntactic variants. The focus of this paper is the description of our ontology model, which we tailored to the particular requirements of our application. We also provide an evaluation of the system based on experimental data obtained from the daily routine work of the study assistants.
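To make the idea of ontology-supported semantic search with negation handling more concrete, the following is a minimal sketch in Python. It is not the CRDW implementation: the toy ontology, the negation cue list, the three-token look-back window, and the function names `expand` and `mentions` are all assumptions made for illustration. The sketch expands a query concept to its synonyms and to more specific concepts (those for which the query concept is a hypernym), and rejects matches that are preceded by a negation cue.

```python
import re

# Hypothetical toy ontology (not from the paper): each concept lists its
# synonyms and its direct hypernym ("parent").
ONTOLOGY = {
    "stroke": {"synonyms": ["cerebrovascular accident"], "parent": None},
    "ischemic stroke": {"synonyms": ["cerebral infarction"], "parent": "stroke"},
}

# Hypothetical negation cues; a real system would use a richer grammar.
NEGATION_CUES = ("no", "without", "denies")


def expand(concept):
    """Return the surface terms of a concept and of all its descendants."""
    terms = {concept, *ONTOLOGY[concept]["synonyms"]}
    for child, entry in ONTOLOGY.items():
        if entry["parent"] == concept:
            terms |= expand(child)
    return terms


def mentions(concept, sentence):
    """Return True if the sentence contains a non-negated match for the concept."""
    text = sentence.lower()
    for term in expand(concept):
        match = re.search(r"\b" + re.escape(term) + r"\b", text)
        if match:
            # Assumption: a cue within the three preceding tokens negates the term.
            preceding = text[:match.start()].split()[-3:]
            if not any(cue in preceding for cue in NEGATION_CUES):
                return True
    return False
```

Under these assumptions, a query for "stroke" would match a sentence mentioning "cerebral infarction" (a synonym of a more specific concept), while a sentence such as "No evidence of stroke." would be rejected because of the negation cue.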
