Efficient approach for incremental high utility pattern mining with indexed list structure

Abstract Since traditional frequent pattern mining approaches assume that all items in binary databases are equally important regardless of their individual characteristics, they struggle to meet the requirements of real-world applications such as finding patterns with high profits. High utility pattern mining was proposed to address this issue, and various related approaches have been studied. There is also demand for efficient methods that discover interesting knowledge in environments where data accumulates continuously over time, such as social network services and wireless sensor network data. Although several algorithms have been devised to mine high utility patterns from incremental databases, their performance is still limited because they generate a large number of candidate patterns and must then identify the actually useful results among those candidates. To solve these problems, we propose a new algorithm for mining high utility patterns from incremental databases. The proposed list-based data structures and mining techniques allow our approach to extract high utility patterns without generating any candidates. In addition, we suggest restructuring and pruning techniques that process incremental data more efficiently. Experimental results on various real and synthetic datasets demonstrate that the proposed algorithm outperforms state-of-the-art methods in terms of runtime, memory usage, and scalability.
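To make the notion of utility concrete, the following is a minimal brute-force sketch of the standard definition used in this line of work: the utility of an itemset is the sum, over every transaction containing all of its items, of quantity times unit profit. It is not the indexed-list algorithm proposed here. The item names, profits, quantities, and the function names `utility` and `high_utility_patterns` are hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical toy data (not from the paper).
# Unit profit per item (often called external utility).
PROFIT = {"a": 5, "b": 2, "c": 1}

# Each transaction maps item -> purchased quantity (internal utility).
DB = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 3},
    {"b": 1, "c": 4},
]


def utility(itemset, db, profit):
    """Sum quantity * profit over transactions that contain every item."""
    total = 0
    for tx in db:
        if all(item in tx for item in itemset):
            total += sum(tx[item] * profit[item] for item in itemset)
    return total


def high_utility_patterns(db, profit, min_util):
    """Enumerate every itemset and keep those with utility >= min_util.

    Exhaustive enumeration is exponential in the number of items; it only
    serves to define the expected output, not to mine it efficiently.
    """
    items = sorted({item for tx in db for item in tx})
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            u = utility(combo, db, profit)
            if u >= min_util:
                result[combo] = u
    return result


if __name__ == "__main__":
    print(high_utility_patterns(DB, PROFIT, min_util=10))
    # Simulate an incremental update by appending one new transaction.
    grown = DB + [{"a": 1, "b": 3, "c": 1}]
    print(high_utility_patterns(grown, PROFIT, min_util=10))
```

In this toy data, item `c` alone stays below the threshold of 10 while the itemset containing `a` and `c` exceeds it, so utility is not anti-monotone the way support is. That is why frequency-based pruning cannot be reused directly, and why recomputing from scratch after each batch of new transactions, as the sketch does, becomes costly on a growing database.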
