Time-series as Background Data for Relating Medical Diagnoses Terms

Relating terms from different ontologies, or identifying the most relevant entry in an ontology for a given term, is an important task in many settings that involve ontologies. The task is often accomplished by considering instance-level matchings within the ontologies being aligned, by using an external common ontology for indirect linking, or by using an annotated text corpus. In this paper, we focus on a variant of this problem that arises when relating medical diagnosis terms. We propose a novel unsupervised approach that exploits the availability of time-series data of medical events recorded for patients during their stay in the intensive care unit (ICU). Our method, called DECREE, quantifies the strength of the relationship between terms from a given ontology; we evaluate it on a large-scale real-world repository of ICU data that includes event data from laboratory test measurements. We further outline how DECREE can be used to assign diagnosis terms in cases of unlabeled pathology. We show that DECREE discovers higher-quality relationships and is more scalable than state-of-the-art time-series techniques.
