Incremental Discovery of Sequential Patterns Using a Backward Mining Approach

Most sequential pattern mining algorithms assume a static database. Once the data change, the previous mining result may no longer be valid, and the entire mining process must be rerun on the updated sequence database. Previous approaches, whether Apriori-based or projection-based, mine patterns in a forward manner. Exploiting the incremental nature of sequence merging, we develop a novel technique, called backward mining, for efficient incremental pattern discovery, and propose an algorithm, BSPinc, that mines sequential patterns incrementally using this strategy. Stable sequences, whose support counts remain unchanged in the updated database, are identified and eliminated from the support counting process. Candidate sequences generated by backward extension can be mined recursively within the ever-shrinking space of the projected sequences. Experimental results show that BSPinc ran on average 2.5 times faster than the well-known IncSpan algorithm and 3 times faster than SPAM.
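
The two notions named above, stable sequences and backward extensions, can be illustrated with a minimal Python sketch. It is not the BSPinc algorithm: it assumes single-item elements, counts support by brute force, and models the update as items appended to existing sequences plus new sequences. All function names and the toy data are hypothetical.

```python
from typing import Dict, Iterable, List, Sequence, Tuple

Pattern = Tuple[str, ...]


def is_subsequence(pattern: Sequence[str], sequence: Sequence[str]) -> bool:
    """True if `pattern` occurs in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    # `item in it` advances the iterator, so order is respected.
    return all(item in it for item in pattern)


def support(pattern: Sequence[str], database: Iterable[Sequence[str]]) -> int:
    """Number of sequences in `database` that contain `pattern`."""
    return sum(1 for seq in database if is_subsequence(pattern, seq))


def merge_databases(
    original: Dict[int, List[str]], increment: Dict[int, List[str]]
) -> Dict[int, List[str]]:
    """Assumed update model: append new items to existing sequences and
    add sequences whose ids are not yet present."""
    updated = {sid: list(seq) for sid, seq in original.items()}
    for sid, appended in increment.items():
        updated.setdefault(sid, []).extend(appended)
    return updated


def stable_patterns(
    patterns: Iterable[Pattern],
    original: Dict[int, List[str]],
    updated: Dict[int, List[str]],
) -> List[Pattern]:
    """Patterns whose support count is identical before and after the update.
    Brute force for illustration only; the point of an incremental method is
    to recognize such patterns without recounting them."""
    return [
        p for p in patterns
        if support(p, original.values()) == support(p, updated.values())
    ]


def backward_extensions(pattern: Pattern, items: Iterable[str]) -> List[Pattern]:
    """Candidates formed by prepending one item to the front of `pattern`."""
    return [(item,) + tuple(pattern) for item in items]


original = {1: ["a", "b"], 2: ["b", "c"], 3: ["a", "c"]}
increment = {2: ["d"], 3: ["b"], 4: ["a", "d"]}
updated = merge_databases(original, increment)

patterns = [("a",), ("b",), ("c",), ("a", "b"), ("b", "c")]
stable = stable_patterns(patterns, original, updated)
candidates = backward_extensions(("c",), ["a", "b"])
```

In this toy data, the patterns `("c",)` and `("b", "c")` keep their support counts after the update, so a support-counting pass could skip them, while `("a",)`, `("b",)`, and `("a", "b")` gain support and would need recounting.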
