On Mining Progressive Positive and Negative Sequential Patterns Simultaneously

Positive sequential pattern (PSP) mining focuses on items that occur, while negative sequential pattern (NSP) mining seeks relationships between occurring and non-occurring items. Few works address NSP mining, and their definitions of NSP are inconsistent with one another. Existing methods also apply the PSP support threshold directly to NSP, which fails to surface interesting patterns. In addition, PSP mining has been studied on incremental and progressive databases, whereas NSP mining has been performed only on static databases. Progressive sequential pattern mining finds the most up-to-date patterns, which can provide more valuable information. However, the previous progressive sequential pattern mining algorithm contains some redundant processing. In this paper, we aim to find NSP on progressive databases. We give a new definition of NSP in order to discover more meaningful and interesting patterns, and we propose an algorithm, Propone, to mine them efficiently. We also propose a level-order traversal strategy and a pruning strategy that reduce the computation time and the number of negative sequential candidates (NSC). Experimental comparisons against modified versions of previous algorithms show that Propone outperforms them.
