Paper information - Maintenance algorithm for high average-utility itemsets with transaction deletion

High-utility itemset mining (HUIM) extends traditional association-rule mining to find profitable itemsets for decision-making. It has a limitation, however: the utility of an itemset tends to increase with its size. High average-utility itemset mining (HAUIM) provides a fairer measure, the average utility of an itemset, which is a more reasonable basis for designing sales strategies and making decisions. Traditional HAUIM algorithms mostly focus on mining high average-utility itemsets (HAUIs) from a static database. When the database changes, for example through transaction insertion or deletion, the discovered information must be updated, and batch-mode algorithms have to re-scan the entire updated database to identify the set of HAUIs. In this paper, we present an updating algorithm called FUP-HAUIMD that maintains the discovered HAUIs under transaction deletion. When some transactions are deleted from the database, FUP-HAUIMD can update the discovered HAUIs without re-scanning the database every time. The algorithm divides the itemsets into four cases based on the modified fast updated (MFUP) concept, and an average-utility (AU)-list structure is used to keep the information needed for the later mining process. Experiments compare FUP-HAUIMD with a state-of-the-art baseline algorithm running in batch mode, and the developed approach shows better performance in terms of runtime, number of examined patterns, and scalability.
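To make the average-utility measure in the abstract concrete, the following minimal Python sketch computes the utility and the average utility of an itemset on a toy database, then shows that the value after a deletion can be obtained by subtracting the contribution of the deleted transactions instead of re-scanning everything. It is an illustration only, not the FUP-HAUIMD algorithm: the database, the item names, and the functions `utility`, `average_utility`, and `updated_average_utility` are all hypothetical, and the four MFUP cases and the AU-list structure are not modeled.

```python
from fractions import Fraction

# Hypothetical toy database: transaction id -> {item: utility of the item in that transaction}.
DATABASE = {
    1: {"a": 5, "b": 2, "c": 1},
    2: {"a": 10, "c": 6},
    3: {"b": 4, "c": 3},
    4: {"a": 5, "b": 6},
}


def utility(itemset, transactions):
    """Sum the utilities of the itemset's items over transactions containing all of them."""
    total = 0
    for items in transactions.values():
        if all(item in items for item in itemset):
            total += sum(items[item] for item in itemset)
    return total


def average_utility(itemset, transactions):
    """Utility divided by the number of items in the itemset."""
    return Fraction(utility(itemset, transactions), len(itemset))


def updated_average_utility(itemset, old_value, deleted):
    """Average utility after deletion, using only the deleted transactions.

    This works because average utility is a sum over transactions, so the
    contribution of the deleted transactions can simply be subtracted.
    """
    return old_value - average_utility(itemset, deleted)


if __name__ == "__main__":
    deleted = {2: DATABASE[2]}  # assume transaction 2 is deleted
    remaining = {tid: t for tid, t in DATABASE.items() if tid not in deleted}

    for itemset in [("a",), ("a", "c")]:
        old = average_utility(itemset, DATABASE)
        incremental = updated_average_utility(itemset, old, deleted)
        rescanned = average_utility(itemset, remaining)
        print(itemset, utility(itemset, DATABASE), old, incremental, rescanned)
```

In this toy data the itemset {a, c} has a higher utility than {a} simply because it contains more items, while its average utility is lower, which is the bias that the average-utility measure is meant to correct. The incremental value and the re-scanned value agree for both itemsets.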

Philippe Fournier-Viger | Chun-Wei Lin | Youcef Djenouri | Xiangmin Guo | Yina Shao
