Weighted Contiguous Sequential Pattern Mining

Much real-world big data, such as traffic flow and network flow, is contiguous, and several contiguous pattern mining algorithms have been developed to handle it. At the same time, the significance of individual data items (e.g., in DNA sequences) often differs, so real data may carry various weights. Existing weighted mining algorithms, however, do not fully consider the continuity of the mined data. In this study, we are the first to formulate the problem of mining weighted contiguous sequential patterns, and we propose a new algorithm named WCSpan. Using a modified prefix pattern expansion and a tight weighted upper-bound model, we prove that WCSpan can efficiently mine weighted contiguous sequential patterns. Experimental results show that, compared with existing similar algorithms, the proposed algorithm has advantages in execution time and memory usage, while the set of output patterns remains complete and no data are omitted. Two factors account for the improvement: the modified prefix expansion generates patterns faster than other methods, and the weighted upper-bound model prunes redundant candidates precisely, which saves memory.
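To make the two ingredients named above concrete, the following is a minimal sketch of prefix-style expansion combined with a weighted upper bound. It is not the WCSpan algorithm itself: the abstract does not give the definitions, so the weighted support used here (average item weight times relative support), the upper bound (maximum item weight times relative support), and all names such as `mine_weighted_contiguous`, `weights`, and `min_wsup` are assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple

Pattern = Tuple[str, ...]


def relative_support(pattern: Pattern, db: List[Pattern]) -> float:
    """Fraction of sequences that contain `pattern` as a contiguous run."""
    n = len(pattern)
    hits = sum(
        any(seq[i:i + n] == pattern for i in range(len(seq) - n + 1))
        for seq in db
    )
    return hits / len(db)


def mine_weighted_contiguous(
    db: List[Pattern], weights: Dict[str, float], min_wsup: float
) -> Dict[Pattern, float]:
    """Naive prefix expansion with an (assumed) weighted upper bound.

    Assumed definitions, not taken from the paper:
      weighted support = mean item weight of the pattern * relative support
      upper bound      = max item weight in the database * relative support
    """
    if min_wsup <= 0:
        raise ValueError("min_wsup must be positive so the search terminates")

    max_weight = max(weights.values())
    items = sorted({item for seq in db for item in seq})
    results: Dict[Pattern, float] = {}

    def grow(pattern: Pattern) -> None:
        sup = relative_support(pattern, db)
        # Appending an item can only lower the contiguous support, and the
        # mean weight never exceeds max_weight, so this bound is safe.
        if sup * max_weight < min_wsup:
            return
        mean_weight = sum(weights[item] for item in pattern) / len(pattern)
        wsup = mean_weight * sup
        if wsup >= min_wsup:
            results[pattern] = wsup
        for item in items:
            grow(pattern + (item,))

    for item in items:
        grow((item,))
    return results


# Toy database: each sequence is a tuple of items.
toy_db = [("a", "b", "c"), ("a", "b", "d"), ("b", "c", "a")]
toy_weights = {"a": 0.9, "b": 0.6, "c": 0.3, "d": 0.1}
found = mine_weighted_contiguous(toy_db, toy_weights, min_wsup=0.4)
```

In this sketch the bound is what justifies stopping early: a prefix whose support is already too low cannot be rescued by any extension, even one made only of the heaviest item. A practical implementation would also track occurrence positions instead of rescanning the database for every candidate.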
