Early classification of multivariate temporal observations by extraction of interpretable shapelets

Background: Early classification of time series is beneficial for biomedical informatics problems including, but not limited to, disease change detection. It can be of tremendous help by identifying the onset of a disease before it has time to fully take hold. In addition, extracting patterns from the original time series helps domain experts gain insight into the classification results. This problem has been studied recently using time series segments called shapelets. In this paper, we present a method, which we call Multivariate Shapelets Detection (MSD), that allows for early and patient-specific classification of multivariate time series. The method extracts time series patterns, called multivariate shapelets, from all dimensions of the time series that distinctly manifest the target class locally. A time series is then classified by searching for the earliest closest patterns.

Results: The proposed early classification method for multivariate time series has been evaluated on eight gene expression datasets from viral infection and drug response studies in humans. In our experiments, the MSD method outperformed the baseline methods, achieving highly accurate classification by using as little as 40%-64% of the time series. The obtained results provide evidence that using conventional classification methods on short time series is not as accurate as using the proposed methods specialized for early classification.

Conclusion: For the early classification task, we proposed a method called Multivariate Shapelets Detection (MSD), which extracts patterns from all dimensions of the time series. We showed that the MSD method can classify the time series early by using as little as 40%-64% of the time series' length.
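The idea of classifying by "the earliest closest patterns" can be illustrated with a minimal sketch. This is not the MSD implementation: the abstract does not specify the distance measure or the matching rule, so the sketch assumes a per-dimension Euclidean distance and a per-dimension distance threshold attached to each shapelet. The names `window_distances` and `classify_early` and the `(pattern, thresholds, label)` layout are hypothetical.

```python
import numpy as np


def window_distances(window, pattern):
    """Per-dimension Euclidean distance between two (length x dims) arrays."""
    return np.sqrt(((window - pattern) ** 2).sum(axis=0))


def classify_early(series, shapelets):
    """Scan the series in time order and stop at the first shapelet match.

    series:    array of shape (T, dims)
    shapelets: list of (pattern, thresholds, label), where pattern has shape
               (length, dims) and thresholds has shape (dims,)
               (assumed layout, not taken from the paper)
    Returns (label, fraction_of_series_used), or (None, 1.0) if nothing matches.
    """
    n_steps = series.shape[0]
    for end in range(1, n_steps + 1):
        best = None  # (total distance, label) of the closest match ending here
        for pattern, thresholds, label in shapelets:
            length = pattern.shape[0]
            if length > end:
                continue  # not enough observations yet for this shapelet
            dists = window_distances(series[end - length:end], pattern)
            # Assumption: a match requires every dimension within its threshold.
            if np.all(dists <= thresholds):
                score = dists.sum()
                if best is None or score < best[0]:
                    best = (score, label)
        if best is not None:
            return best[1], end / n_steps
    return None, 1.0
```

Under these assumptions, a series is labeled as soon as any window ending at the current time point falls within a shapelet's thresholds in every dimension, and the returned fraction reports how much of the series was needed, which is the quantity the 40%-64% figures refer to.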
