An efficient algorithm for mining periodic high-utility sequential patterns

A periodic high-utility sequential pattern (PHUSP) is a pattern that not only yields a high utility (e.g., high profit) but also appears regularly in a sequence database. Finding PHUSPs is useful in applications such as market basket analysis, where they can reveal recurring and profitable customer behavior. Although discovering PHUSPs is desirable, it is computationally difficult. To discover PHUSPs efficiently, this paper proposes a structure for periodic high-utility sequential pattern mining (PHUSPM) named PUSP. Furthermore, a pruning strategy is developed to reduce the search space and speed up PHUSPM. The result is an efficient algorithm called periodic high-utility sequential pattern optimal miner (PUSOM). An experimental evaluation was performed on both synthetic and real-life datasets to compare PUSOM with four state-of-the-art PHUSPM algorithms in terms of execution time, memory usage, and scalability. The results show that PUSOM efficiently discovers the complete set of PHUSPs. Moreover, it outperforms the four other algorithms because its structure and pruning strategy allow it to prune many unpromising patterns.
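To make the two conditions concrete, the following is a minimal sketch of how a single candidate pattern might be checked against a small sequence database. It is not the PUSOM algorithm and does not use the PUSP structure, neither of which is described in the abstract. It assumes definitions that are common in the periodic and high-utility mining literature: the utility of a pattern in a sequence is the maximum over its occurrences, and a pattern is periodic when the largest gap between consecutive sequences containing it does not exceed a threshold. The names `pattern_utility`, `periods`, `is_phusp`, `min_util`, and `max_per` are hypothetical, and patterns are restricted to one item per itemset for brevity.

```python
# Illustrative check of one candidate pattern (assumed definitions, not PUSOM).
# A sequence is a list of itemsets; an itemset maps item -> utility (e.g. profit).

def pattern_utility(sequence, pattern):
    """Maximum utility over all occurrences of `pattern` in `sequence`,
    or None if the pattern does not occur. Each pattern item must be
    matched in a strictly later itemset than the previous one."""
    neg = float("-inf")
    # best[k] = best utility found so far for the first k pattern items
    best = [0] + [neg] * len(pattern)
    for itemset in sequence:
        # Descending k ensures one itemset matches at most one pattern item.
        for k in range(len(pattern), 0, -1):
            item = pattern[k - 1]
            if item in itemset and best[k - 1] != neg:
                best[k] = max(best[k], best[k - 1] + itemset[item])
    return None if best[-1] == neg else best[-1]

def periods(sids, n):
    """Gaps between consecutive 1-based sequence ids, including the gap
    from the start (0) and to the end (n) of the database."""
    points = [0] + sids + [n]
    return [b - a for a, b in zip(points, points[1:])]

def is_phusp(db, pattern, min_util, max_per):
    """True if the total utility is at least min_util and the largest
    period is at most max_per (both thresholds are assumptions here)."""
    sids, total = [], 0
    for sid, seq in enumerate(db, start=1):
        u = pattern_utility(seq, pattern)
        if u is not None:
            sids.append(sid)
            total += u
    return total >= min_util and max(periods(sids, len(db))) <= max_per

# Toy database of four customer sequences (made-up values).
db = [
    [{"a": 5, "b": 2}, {"c": 4}],
    [{"a": 3}, {"b": 1}],
    [{"a": 2}, {"c": 6}, {"c": 1}],
    [{"a": 4, "c": 3}, {"c": 2}],
]
print(is_phusp(db, ["a", "c"], min_util=20, max_per=2))
```

Checking every candidate this way is exponential in the number of items, which is why a dedicated structure and a pruning strategy are needed in practice.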
