Hidden Markov model-based time series prediction using motifs for detecting inter-time-serial correlations

This paper presents an approach to time series prediction with a Hidden Markov Model that is based on inter-time-serial correlations. These correlations between the time series of a given database are discovered automatically by hierarchically clustering motif-based time series representations. They can then be used to predict the future development of one time series on the basis of known values from that series and from the series correlated with it. The functionality of the approach and the influence of the different parameters of the motif-based representation, the inter-time-serial correlation discovery, and the prediction capability are evaluated on two large databases of river level measurements and stock data.
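To make the correlation-discovery step more concrete, the following minimal sketch shows one way that motif-based representations could be compared and grouped hierarchically. The abstract does not give the paper's motif definition, distance measure, or linkage criterion, so everything here is an assumption for illustration: the SAX-like discretization with a three-letter alphabet, fixed-length symbol n-grams as "motifs", cosine distance, single linkage, and all function names (`discretize`, `motif_counts`, `cosine_distance`, `single_linkage`).

```python
from collections import Counter
from math import sqrt


def discretize(series, breakpoints=(-0.43, 0.43), alphabet="abc"):
    """Z-normalise a series and map each value to a symbol.

    Assumption: SAX-like symbols without piecewise aggregation; the
    breakpoints split a standard normal into three equiprobable bins.
    """
    n = len(series)
    mean = sum(series) / n
    std = sqrt(sum((x - mean) ** 2 for x in series) / n) or 1.0
    symbols = []
    for x in series:
        z = (x - mean) / std
        symbols.append(alphabet[sum(z > b for b in breakpoints)])
    return "".join(symbols)


def motif_counts(symbols, length=3):
    """Count every symbol n-gram; a hypothetical stand-in for motifs."""
    return Counter(symbols[i:i + length]
                   for i in range(len(symbols) - length + 1))


def cosine_distance(a, b):
    """1 - cosine similarity between two motif count vectors."""
    dot = sum(a[k] * b[k] for k in set(a) | set(b))
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def single_linkage(names, dist, threshold):
    """Agglomerative clustering: merge the closest pair of clusters
    until the smallest inter-cluster distance exceeds the threshold."""
    clusters = [{name} for name in names]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(dist[a][b] for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        if best[0] > threshold:
            break
        _, i, j = best
        clusters[i] |= clusters[j]
        del clusters[j]
    return clusters


# Toy data (invented for illustration): A and B share a shape, C does not.
data = {
    "A": [1, 2, 3, 4, 5, 6, 7, 8],
    "B": [10, 20, 30, 40, 50, 60, 70, 80],
    "C": [8, 1, 8, 1, 8, 1, 8, 1],
}
reps = {name: motif_counts(discretize(s)) for name, s in data.items()}
dist = {a: {b: cosine_distance(reps[a], reps[b]) for b in reps} for a in reps}
groups = single_linkage(list(data), dist, threshold=0.5)
```

In this toy setup, series that end up in the same group would be the candidates whose known values feed the prediction of one another; the prediction step itself, which the paper performs with a Hidden Markov Model, is not sketched here.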
