Entropy-Based Symbolic Representation for Time Series Classification

To improve the performance of time series classification, we introduce a new classification approach. Its first idea is to use the entropy impurity measure to discretize and symbolize time series: the original time series is discretized into disjoint intervals using the entropy impurity measure and is then transformed into a symbolic representation. Its second idea is to combine this symbolic representation with the k-nearest-neighbor classifier to classify time series. The proposed approach is compared with a number of known pattern classifiers on artificial and real-world benchmark data sets. The experimental results show that it can reduce the error rates of time series classification and is highly competitive with previous approaches.
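The two ideas can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions that the abstract does not specify: every point of a training series is labeled with the class of its series, cut points are found by recursive binary splitting to a fixed depth (no stopping criterion is modeled), symbolic strings of equal length are compared with the Hamming distance, and all function names (`fit_cuts`, `symbolize`, `knn_predict`, and so on) are hypothetical.

```python
import math
from bisect import bisect_left
from collections import Counter


def entropy(labels):
    """Entropy impurity (in bits) of a non-empty list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_cut(pairs):
    """Threshold that minimizes the weighted entropy of the two sides.

    `pairs` is a list of (value, label) sorted by value. Returns None when
    no cut lowers the impurity (for example, when all labels agree).
    """
    n = len(pairs)
    labels = [lab for _, lab in pairs]
    best, best_score = None, entropy(labels)
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # a cut must fall between two distinct values
        score = (i * entropy(labels[:i]) + (n - i) * entropy(labels[i:])) / n
        if score < best_score:
            best, best_score = (pairs[i - 1][0] + pairs[i][0]) / 2, score
    return best


def cut_points(pairs, depth):
    """Recursively split up to `depth` levels (assumed stopping rule)."""
    if depth == 0 or len(pairs) < 2:
        return []
    cut = best_cut(pairs)
    if cut is None:
        return []
    left = [p for p in pairs if p[0] <= cut]
    right = [p for p in pairs if p[0] > cut]
    return cut_points(left, depth - 1) + [cut] + cut_points(right, depth - 1)


def fit_cuts(train, depth=2):
    """Pool all points of all training series, each tagged with its class."""
    pairs = sorted(((v, lab) for series, lab in train for v in series),
                   key=lambda p: p[0])
    return cut_points(pairs, depth)


def symbolize(series, cuts):
    """Map each value to the letter of the interval it falls into."""
    return "".join(chr(ord("a") + bisect_left(cuts, v)) for v in series)


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(train_syms, query_sym, k=1):
    """Majority label among the k symbolic strings nearest to the query."""
    nearest = sorted(train_syms, key=lambda t: hamming(t[0], query_sym))[:k]
    return Counter(lab for _, lab in nearest).most_common(1)[0][0]


# Toy usage with made-up data (not from the paper's experiments).
train = [
    ([0.1, 0.2, 0.1, 0.3], "low"),
    ([0.2, 0.1, 0.3, 0.2], "low"),
    ([0.9, 1.1, 1.0, 0.8], "high"),
    ([1.0, 0.9, 1.2, 1.1], "high"),
]
cuts = fit_cuts(train, depth=1)
train_syms = [(symbolize(s, cuts), lab) for s, lab in train]
prediction = knn_predict(train_syms, symbolize([0.95, 1.05, 0.2, 1.0], cuts))
```

Limiting the recursion depth keeps the alphabet small; a principled stopping criterion or a different string distance could be substituted without changing the overall structure.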
