A sequential pattern mining algorithm using rough set theory

Sequential pattern mining is a crucial but challenging task in many applications, e.g., analyzing the behavior of data in transactions and discovering frequent patterns in time series data. The task becomes difficult when valuable patterns occur only locally or implicitly in noisy data. In this paper, we propose a method for mining such local patterns from sequences. Using rough set theory, we describe an algorithm that generates decision rules which take local patterns into account when arriving at a particular decision. To apply rough set theory to sequential data, the size of local patterns is specified, allowing a set of sequences to be transformed into a sequential information system. We use the discernibility of decision classes to establish evaluation criteria for the decision rules in the sequential information system.
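
The transformation described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm. It assumes that a local pattern is a contiguous window of a fixed size k, that each labelled sequence becomes one object of the information system, and that candidate rules "pattern -> decision" are scored by accuracy and coverage, two standard rough set rule measures used here in place of the paper's discernibility-based criteria. The names local_patterns, build_information_system, and rule_measures are hypothetical.

```python
from collections import defaultdict


def local_patterns(seq, k):
    """Return the set of contiguous length-k windows of seq.

    Assumption: a 'local pattern' is a fixed-size contiguous window.
    """
    return {tuple(seq[i:i + k]) for i in range(len(seq) - k + 1)}


def build_information_system(sequences, k):
    """Turn (sequence, decision) pairs into (window set, decision) objects."""
    return [(local_patterns(seq, k), decision) for seq, decision in sequences]


def rule_measures(table):
    """Score each rule 'pattern -> decision' by (accuracy, coverage).

    accuracy = |objects with pattern and decision| / |objects with pattern|
    coverage = |objects with pattern and decision| / |objects with decision|
    Assumption: these stand in for the paper's evaluation criteria.
    """
    pattern_count = defaultdict(int)
    class_count = defaultdict(int)
    joint = defaultdict(int)
    for patterns, decision in table:
        class_count[decision] += 1
        for p in patterns:
            pattern_count[p] += 1
            joint[(p, decision)] += 1
    return {
        (p, d): (n / pattern_count[p], n / class_count[d])
        for (p, d), n in joint.items()
    }


# Toy data, invented for illustration only.
data = [
    ("abcab", "yes"),
    ("cabca", "yes"),
    ("bbacc", "no"),
    ("ccbba", "no"),
]
table = build_information_system(data, k=2)
measures = rule_measures(table)
```

In this toy data the window ('a', 'b') appears in both "yes" sequences and in no "no" sequence, so the rule ('a', 'b') -> "yes" has accuracy and coverage of 1.0. The window ('a', 'c') appears in only one of the two "no" sequences, so its rule has accuracy 1.0 but coverage 0.5.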
