Sequential Pattern Mining Algorithm Based on Interestingness

Sequential pattern mining is an important data mining task and a challenging one in many practical applications. Traditional sequential pattern mining algorithms usually treat frequent patterns as the interesting sequential patterns, but frequency is not always a good proxy for interestingness. This paper proposes a sequential pattern mining algorithm based on interestingness that uses the branch-and-bound method of OPUS to traverse all possible candidate sequences. The algorithm uses a bitmap data structure to reduce storage space, and experimental results on synthetic data with known patterns confirm the validity of the proposed algorithm.
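To make the bitmap idea concrete, the following is a minimal Python sketch of how a bitmap can encode item positions within a sequence and how a pattern's support can be counted with bitwise operations. It follows the general style of bitmap-based sequential pattern mining and is not the paper's implementation: the abstract does not specify the bitmap layout, so the function names (`build_bitmaps`, `s_step`, `support`), the single-item-per-position assumption, and the toy database are all illustrative assumptions. The interestingness measure and the OPUS search are not shown.

```python
from typing import Dict, List, Sequence


def build_bitmaps(seq: Sequence[str]) -> Dict[str, int]:
    """Map each item to an int whose bit i is set when the item occurs at position i."""
    bitmaps: Dict[str, int] = {}
    for pos, item in enumerate(seq):
        bitmaps[item] = bitmaps.get(item, 0) | (1 << pos)
    return bitmaps


def s_step(prefix_bits: int, item_bits: int, length: int) -> int:
    """Return positions where the item occurs strictly after the earliest end of the prefix."""
    if prefix_bits == 0:
        return 0
    lowest = prefix_bits & -prefix_bits              # earliest position where the prefix ends
    after = ((1 << length) - 1) & ~((lowest << 1) - 1)  # all later positions in the sequence
    return after & item_bits


def support(pattern: Sequence[str], database: List[Sequence[str]]) -> int:
    """Count the sequences that contain `pattern` (non-empty) as a subsequence."""
    count = 0
    for seq in database:
        bitmaps = build_bitmaps(seq)
        bits = bitmaps.get(pattern[0], 0)
        for item in pattern[1:]:
            bits = s_step(bits, bitmaps.get(item, 0), len(seq))
        if bits:
            count += 1
    return count


# Hypothetical toy database: one item per position in each sequence.
database = [list("abcab"), list("bca"), list("acb"), list("cc")]
print(support(["a", "b"], database))
```

In this sketch each item costs one integer per sequence, which is where the storage saving of a bitmap layout comes from. A real implementation would build the bitmaps once rather than on every call to `support`.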
