A data mining framework for time series estimation

Time series estimation techniques are usually employed in biomedical research to derive less accessible variables from a set of related, more accessible variables. These techniques are traditionally built from systems modeling approaches including simulation, blind deconvolution, and state estimation. In this work, we define the target time series (TTS) and its related time series (RTS) as the output and input of a time series estimation process, respectively. We then propose a novel data mining framework for time series estimation when TTS and RTS represent different sets of observed variables from the same dynamic system. This is made possible by mining a database of instances of TTS, its simultaneously recorded RTS, and the input/output dynamic models between them. The key mining strategy is to formulate a mapping function for each TTS-RTS pair in the database that translates a feature vector extracted from RTS to the dissimilarity between the true TTS and its estimate from the dynamic model associated with the same TTS-RTS pair. At run time, a feature vector is extracted from an inquiry RTS and supplied to the mapping function associated with each TTS-RTS pair to calculate a dissimilarity measure. An optimal TTS-RTS pair is then selected by analyzing these dissimilarity measures. The associated input/output model of the selected TTS-RTS pair is then used to simulate the TTS given the inquiry RTS as an input. An exemplary implementation was built to address a biomedical problem of noninvasive intracranial pressure assessment. The performance of the proposed method was superior to that of a simple training-free approach of finding the optimal TTS-RTS pair by a conventional similarity-based search on RTS features.
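
The run-time procedure described above (extract features from the inquiry RTS, evaluate each pair's mapping function, select a pair, simulate the TTS) can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration and is not taken from the paper: the `Entry` record, the two-element feature vector (mean and standard deviation), the linear form of the mapping function, the FIR filter standing in for the input/output dynamic model, and the rule of selecting the pair with the smallest predicted dissimilarity.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import List, Sequence, Tuple


@dataclass
class Entry:
    """One TTS-RTS pair in the database (all fields are hypothetical)."""
    name: str
    map_weights: List[float]  # assumed linear mapping: features -> dissimilarity
    map_bias: float
    fir_coeffs: List[float]   # assumed model: TTS[n] = sum_k b[k] * RTS[n - k]


def extract_features(rts: Sequence[float]) -> List[float]:
    """Illustrative feature vector: mean and population standard deviation."""
    return [mean(rts), pstdev(rts)]


def predicted_dissimilarity(entry: Entry, features: Sequence[float]) -> float:
    """Evaluate the entry's mapping function on the inquiry features."""
    return entry.map_bias + sum(w * f for w, f in zip(entry.map_weights, features))


def select_entry(database: Sequence[Entry], rts: Sequence[float]) -> Entry:
    """Pick the pair with the smallest predicted dissimilarity (assumed rule)."""
    features = extract_features(rts)
    return min(database, key=lambda e: predicted_dissimilarity(e, features))


def simulate_tts(entry: Entry, rts: Sequence[float]) -> List[float]:
    """Drive the selected entry's FIR model with the inquiry RTS."""
    b = entry.fir_coeffs
    return [
        sum(b[k] * rts[n - k] for k in range(len(b)) if n - k >= 0)
        for n in range(len(rts))
    ]


def estimate_tts(database: Sequence[Entry], rts: Sequence[float]) -> Tuple[str, List[float]]:
    """Full run-time path: select a pair, then simulate the TTS with its model."""
    best = select_entry(database, rts)
    return best.name, simulate_tts(best, rts)


# Toy database with made-up parameters, for demonstration only.
database = [
    Entry("A", map_weights=[1.0, 0.0], map_bias=0.0, fir_coeffs=[1.0]),
    Entry("B", map_weights=[0.0, 1.0], map_bias=0.0, fir_coeffs=[0.5, 0.5]),
]
name, tts = estimate_tts(database, [1.0, 2.0, 3.0, 4.0])
print(name, tts)
```

In this toy run, pair "B" is selected because its mapping function predicts the smaller dissimilarity for the inquiry RTS, and its model is then used to simulate the TTS. In a real system the mapping functions and dynamic models would be learned from the recorded TTS-RTS pairs rather than written by hand.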
