Maintenance algorithm for high average-utility itemsets with transaction deletion

High-utility itemset mining (HUIM) extends traditional association-rule mining to find profitable itemsets for decision-making. It has a limitation, however: the utility of an itemset tends to increase with its size. High average-utility itemset mining (HAUIM) offers a fairer measure, the average utility of an itemset, which is a more reasonable basis for designing sales strategies and making decisions. Traditional HAUIM algorithms mostly mine high average-utility itemsets (HAUIs) from a static database. When the database changes, for example through transaction insertion or deletion, the discovered information must be updated, so a batch-mode algorithm has to rescan the updated database to identify the set of HAUIs. In this paper, we present an updating algorithm called FUP-HAUIMD that maintains the discovered HAUIs under transaction deletion. When some transactions are deleted from the database, FUP-HAUIMD can update the discovered HAUIs without rescanning the whole database each time. The algorithm divides the itemsets into four cases based on the modified fast updated (MFUP) concept, and uses the average-utility (AU)-list structure to keep the information needed for the later mining process. Experiments compare FUP-HAUIMD with a state-of-the-art baseline algorithm running in batch mode; the developed approach shows better performance in terms of runtime, number of examined patterns, and scalability.
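To make the average-utility measure and the effect of deletion concrete, here is a minimal Python sketch. It assumes the definition commonly used in the HAUIM literature (the utility of an itemset summed over the transactions that contain it, divided by the number of items in the itemset), which the abstract itself does not spell out. The names `average_utility`, `unit_profit`, and the toy transactions are hypothetical. This is not the FUP-HAUIMD algorithm: it only illustrates why the deleted transactions alone are enough to adjust an already known average utility.

```python
from typing import Dict, FrozenSet, List

# A transaction maps each purchased item to its quantity (toy representation).
Transaction = Dict[str, int]


def average_utility(
    itemset: FrozenSet[str],
    transactions: List[Transaction],
    unit_profit: Dict[str, int],
) -> float:
    """Average utility of `itemset` under the assumed standard definition:
    sum its utility over every transaction that contains all of its items,
    then divide by the number of items in the itemset."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * unit_profit[item] for item in itemset)
    return total / len(itemset)


# Hypothetical data, for illustration only.
unit_profit = {"a": 5, "b": 2, "c": 1}
t1 = {"a": 1, "b": 2}
t2 = {"a": 2, "c": 3}
t3 = {"a": 1, "b": 1, "c": 4}

original = [t1, t2, t3]
deleted = [t3]
remaining = [t1, t2]

x = frozenset({"a", "b"})
au_original = average_utility(x, original, unit_profit)
au_deleted = average_utility(x, deleted, unit_profit)

# Average utility is additive over transactions, so the value after deletion
# follows from the old value and the deleted part, without a full rescan.
au_updated = au_original - au_deleted
au_rescanned = average_utility(x, remaining, unit_profit)
```

Here `au_updated` and `au_rescanned` agree, which is the additivity that a maintenance approach can exploit: only itemsets whose status may change after the subtraction need further attention.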
