Discovering Important Sequential Patterns with Length-Decreasing Weighted Support Constraints

Constraint-based sequential pattern mining has been developed to improve the efficiency and effectiveness of the mining process. Two constraints are of particular interest. First, some sequences are more important than others; weight constraints capture the importance of sequences and of the items within them. Second, patterns containing only a few items are interesting if they have high support, whereas long patterns can be interesting even when their support is relatively low; length-decreasing support constraints capture this. Both paradigms aim to find important sequential patterns while reducing the number of uninteresting ones. Although both constraints are vital, it is hard to apply them together using previous approaches. In this paper, we integrate weight and length-decreasing support constraints by pushing both into the prefix projection growth method. For pruning, we define the Weighted Smallest Valid Extension property and apply it in our pruning methods to reduce the search space. Performance tests show that our algorithm mines important sequential patterns under length-decreasing support constraints.
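To make the two constraints concrete, the following is a minimal illustrative sketch, not the algorithm of this paper. It assumes sequences of single items (no itemsets), takes weighted support to be the average item weight of a pattern multiplied by its support, and uses a hypothetical linear threshold function `length_decreasing_min_support`; all names, weights, and threshold values are assumptions chosen for illustration. It only checks whether a given pattern satisfies both constraints and does not implement prefix projection or the Weighted Smallest Valid Extension pruning.

```python
from typing import Dict, List, Sequence


def is_subsequence(pattern: Sequence[str], sequence: Sequence[str]) -> bool:
    """Return True if `pattern` occurs in `sequence` in order (gaps allowed)."""
    remaining = iter(sequence)
    # `item in remaining` advances the iterator until the item is found.
    return all(item in remaining for item in pattern)


def support(pattern: Sequence[str], database: List[Sequence[str]]) -> int:
    """Count the sequences in the database that contain the pattern."""
    return sum(is_subsequence(pattern, seq) for seq in database)


def length_decreasing_min_support(length: int, start: int = 4,
                                  step: int = 1, floor: int = 2) -> int:
    """Hypothetical threshold that is non-increasing in pattern length."""
    return max(floor, start - step * (length - 1))


def weighted_support(pattern: Sequence[str], database: List[Sequence[str]],
                     weights: Dict[str, float]) -> float:
    """Assumed definition: average item weight times support."""
    avg_weight = sum(weights[item] for item in pattern) / len(pattern)
    return avg_weight * support(pattern, database)


def satisfies_constraints(pattern: Sequence[str],
                          database: List[Sequence[str]],
                          weights: Dict[str, float]) -> bool:
    """Check weighted support against the length-dependent threshold."""
    threshold = length_decreasing_min_support(len(pattern))
    return weighted_support(pattern, database, weights) >= threshold


# Toy data (illustrative only).
database = [list("abcd"), list("abd"), list("acd"), list("bc")]
weights = {"a": 1.0, "b": 0.5, "c": 1.0, "d": 1.0}

short_pattern_ok = satisfies_constraints(["a"], database, weights)
long_pattern_ok = satisfies_constraints(["a", "c", "d"], database, weights)
```

In this toy setting the short pattern `a` has support 3 but fails the threshold of 4 for length 1, while the longer pattern `a, c, d` has support 2 and passes the lower threshold of 2 for length 3, which is the behavior a length-decreasing constraint is meant to produce.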
