Mining patterns in long sequential data with noise

Pattern discovery in time series data is an important problem in many fields, e.g., computational biology, performance analysis, and consumer behavior. A considerable amount of research has recently been carried out in this area. The input data is typically very large, and noise may be present in various forms; both pose a great challenge to the mining process. In this paper, we present several of our recent research advances in this area. We survey new models proposed to address different types of noise, as well as scalable algorithms developed for efficiently mining patterns under each model.
