Discovering Quantitative Temporal Functional Dependencies on Clinical Data

Approximate functional dependencies, including versions with suitable temporal extensions, have recently been proposed as a methodological tool for mining clinical data. They allow healthcare stakeholders to derive new knowledge from the overwhelming amount of healthcare and clinical data. Examples of the knowledge derivable from data through such dependencies are "month by month, patients with the same symptoms get the same type of therapy" or "within 15 days, patients with the same diagnosis and the same therapy receive the same daily amount of drug". The main limitation of these dependencies is that they cannot deal with quantitative data for which some tolerance can be allowed on numerical values. This limitation arises in particular in clinical data warehouses, where analysis and mining have to consider one or more measures (quantitative data such as lab test results and vital signs such as blood pressure and temperature) with respect to many dimensional (alphanumeric) attributes (such as patient, hospital, physician, and diagnosis) and to some time dimensions (such as the day since hospitalization or the calendar date). For this scenario, we introduce a new kind of approximate temporal functional dependency, named multi approximate temporal functional dependency (MATFD), which considers dependencies between dimensions and quantitative measures in temporal clinical data. Such dependencies may provide new knowledge such as "within 15 days, patients with the same diagnosis and the same therapy receive a daily amount of drug within a fixed range". Moreover, we provide an original algorithm to mine such dependencies and to derive some core dependencies, both for the discovered temporal window and for the involved dimensional attributes. Finally, we discuss some first results obtained by pre-processing and mining ICU data from the MIMIC III database.
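To make the idea of a tolerance on quantitative measures concrete, the following is a minimal sketch and not the mining algorithm of the paper. It assumes a simplified setting: tuples are grouped by their dimensional attributes and by fixed, non-overlapping time buckets of `window` days (a simplification of the temporal windows mentioned above), and the error is the smallest fraction of tuples that must be dropped so that, in every group, the remaining measure values span at most `delta`. The names `largest_within_tolerance`, `dependency_error`, `window`, and `delta`, as well as the sample rows, are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def largest_within_tolerance(values, delta):
    """Size of the largest subset of values whose max - min <= delta."""
    ordered = sorted(values)
    best = 0
    left = 0
    # Sliding window over the sorted values.
    for right, value in enumerate(ordered):
        while value - ordered[left] > delta:
            left += 1
        best = max(best, right - left + 1)
    return best


def dependency_error(rows, dims, measure, time_attr, window, delta):
    """Fraction of rows to drop so each group keeps its measure within delta.

    Assumption: groups are defined by the dimensional attributes plus a
    fixed time bucket (time_attr // window), not by a sliding window.
    """
    if not rows:
        return 0.0
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[d] for d in dims) + (row[time_attr] // window,)
        groups[key].append(row[measure])
    kept = sum(largest_within_tolerance(vs, delta) for vs in groups.values())
    return (len(rows) - kept) / len(rows)


# Hypothetical toy data: day since hospitalization, diagnosis, therapy, dose.
rows = [
    {"day": 1, "diagnosis": "sepsis", "therapy": "A", "dose": 100.0},
    {"day": 3, "diagnosis": "sepsis", "therapy": "A", "dose": 104.0},
    {"day": 9, "diagnosis": "sepsis", "therapy": "A", "dose": 150.0},
    {"day": 2, "diagnosis": "copd", "therapy": "B", "dose": 20.0},
    {"day": 20, "diagnosis": "sepsis", "therapy": "A", "dose": 150.0},
]

error = dependency_error(
    rows, ["diagnosis", "therapy"], "dose", "day", window=15, delta=5.0
)
print(error)
```

Under these assumptions, a dependency such as "within 15 days, the same diagnosis and therapy imply a daily dose within a range of 5 units" would be accepted when the computed error does not exceed a chosen threshold. In the toy data, only the 150.0 dose in the first 15-day bucket has to be dropped.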
