Paper Information - Textual Approximation Methods for Time Series Classification: TAX and l-TAX

SUMMARY Much work has been done on time series classification and similarity search over the past decades. However, classification accuracy is still insufficient for applications such as ubiquitous or sensor systems. In this paper, a novel textual approximation of a time series, called TAX, is proposed to achieve high-accuracy time series classification. l-TAX, an extended version of TAX that shows promising classification accuracy compared with TAX and other existing methods, is also proposed. We also provide a comprehensive comparison between TAX and l-TAX, and discuss the benefits of both methods. Both TAX and l-TAX transform a time series into a textual structure using existing document retrieval methods and bioinformatics algorithms. In TAX, a time series is represented as a document-like structure, whereas l-TAX uses a sequence of textual symbols. This paper provides a comprehensive overview of the textual approximation and the techniques used by TAX and l-TAX.
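The general idea of treating a time series as a sequence of textual symbols can be illustrated with a minimal sketch. This is not the TAX or l-TAX algorithm, whose details the summary does not give. It is a generic stand-in that assumes a simple quantization of z-normalized segment means into letters, followed by a longest-common-subsequence comparison and a 1-nearest-neighbor decision. The breakpoints, the alphabet, the segment length, and all function names are hypothetical choices made for illustration.

```python
from bisect import bisect_right
from statistics import mean, pstdev

# Assumed breakpoints (roughly the quartiles of a standard normal
# distribution) and a four-letter alphabet; both are illustrative only.
BREAKPOINTS = [-0.67, 0.0, 0.67]
ALPHABET = "abcd"


def to_symbols(series, segment_len=2):
    """Turn a numeric series into a string: z-normalize, average each
    fixed-length segment, and map the average to a letter."""
    mu = mean(series)
    sigma = pstdev(series) or 1.0  # avoid division by zero on flat series
    z = [(x - mu) / sigma for x in series]
    symbols = []
    for i in range(0, len(z), segment_len):
        segment_mean = mean(z[i:i + segment_len])
        symbols.append(ALPHABET[bisect_right(BREAKPOINTS, segment_mean)])
    return "".join(symbols)


def lcs_length(a, b):
    """Length of the longest common subsequence of two strings,
    computed with the standard dynamic program (two rows of memory)."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, 1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def similarity(a, b):
    """LCS length normalized by the longer string, in [0, 1]."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / max(len(a), len(b))


def classify_1nn(query, labeled_series, segment_len=2):
    """Return the label of the training series whose symbol string is
    most similar to the query's symbol string."""
    q = to_symbols(query, segment_len)
    best = max(
        labeled_series,
        key=lambda pair: similarity(q, to_symbols(pair[0], segment_len)),
    )
    return best[1]


if __name__ == "__main__":
    training = [
        (list(range(8)), "rising"),
        (list(range(7, -1, -1)), "falling"),
    ]
    print(classify_1nn([1, 2, 3, 4, 5, 6, 7, 9], training))
```

Once the series are strings, any string or sequence comparison method can be swapped in for the LCS step, which is the property that makes a textual representation attractive for classification.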

Hung-Hsuan Huang | Kyoji Kawagoe | Abdulla-Al-Maruf
