Incremental mining of sequential patterns: Progress and challenges

Sequential pattern mining is an important problem with broad applications. It is also challenging, because a combinatorially large number of intermediate subsequences are generated and each must be examined. Most basic solutions assume that mining is performed on a static database. Modern databases, however, are dynamic and continuously updated, so incremental mining of sequential patterns has become the norm. This article investigates the need for incremental mining of sequential patterns. We present an analytical study of more than twenty incremental mining algorithms, focusing on their characteristics, and discuss the issues associated with each. We infer that the better approach is incremental mining on a progressive database. The three most relevant algorithms based on this approach are studied in depth, along with other work in this area, to indicate directions for future research.
