Analyzing trends by symbolic episode representation and sequence alignment

Data analysis is often associated with quantitative techniques because of the large amount of available data and the easy-to-use statistical tools. Qualitative trend analysis (QTA) techniques always have to be guided by some data reduction method, e.g. principal component analysis (PCA) or segmentation, after which the preprocessed, reduced-size data can be analyzed further. Derivative-based segmentation methods, which are popular in fault diagnosis, are presented. Given an adequate distance measure, one is able to qualify, compare or classify different time series. This article proposes two segmentation-based alignment techniques: one based on dynamic time warping (DTW), and a newly developed one that uses pairwise sequence alignment, a common tool in bioinformatics, to align triangular episode sequences. Both techniques depend strongly on the pre-defined distance or similarity measure between the trends, because they search for the path of minimal distance or maximal similarity. The two techniques are compared and qualified in a case study based on handwriting data. The results show that sequence alignment based on symbolic episode segmentation, aided by the prior knowledge of the operators, can handle qualitative trend analysis and is therefore able to monitor and qualify operating processes.
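As an illustration of the two alignment ideas named in the abstract, the following is a minimal Python sketch, not the authors' implementation. It assumes an absolute-difference local cost for DTW and a flat match/mismatch/gap scoring for the global pairwise alignment; the article instead relies on a pre-defined distance or similarity measure between trends. The function names, the scoring values and the letters used as episode symbols are hypothetical.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric series.

    Assumption: local cost is the absolute difference |a[i] - b[j]|.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # d[i][j] = minimal cumulative cost of warping a[:i] onto b[:j]
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j],      # repeat b[j-1]
                                 d[i][j - 1],      # repeat a[i-1]
                                 d[i - 1][j - 1])  # advance both
    return d[n][m]


def alignment_score(s, t, match=1, mismatch=-1, gap=-1):
    """Global pairwise alignment score of two symbol sequences.

    Assumption: flat scoring (hypothetical values); a similarity matrix
    between episode types could replace the match/mismatch terms.
    """
    n, m = len(s), len(t)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap  # s[:i] aligned against gaps only
    for j in range(1, m + 1):
        score[0][j] = j * gap  # t[:j] aligned against gaps only
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if s[i - 1] == t[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + pair,  # align symbols
                              score[i - 1][j] + gap,       # gap in t
                              score[i][j - 1] + gap)       # gap in s
    return score[n][m]


if __name__ == "__main__":
    # Two numeric trends that differ only by a stretched plateau.
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
    # Two hypothetical episode sequences, one symbol per episode type.
    print(alignment_score("ABCD", "ABD"))
```

In this sketch DTW searches for the minimal-distance warping path on the numeric values, whereas the alignment maximizes similarity over the symbolic episode sequence, which mirrors the minimal-distance versus maximal-similarity contrast drawn above.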
