Efficiently updating the discovered high average-utility itemsets with transaction insertion

High-utility itemset mining (HUIM) is an extension of frequent-itemset mining (FIM) that considers the unit profit and quantity of items to discover the set of high-utility itemsets (HUIs). Traditionally, the utility of an itemset is the sum of its utilities over all the transactions in which it appears, regardless of its length. This approach is, however, inappropriate in real-world applications, since the utility of an itemset tends to increase with the number of items it contains. High average-utility itemset mining (HAUIM) was designed to provide a more reasonable utility measure by taking the size of the itemset into account. Existing algorithms, however, can only handle static databases and are unsuitable for dynamic environments, since the size of the data changes frequently in real-life situations. In this paper, an incremental high average-utility pattern mining (IHAUPM) algorithm is presented to handle incremental databases with transaction insertion. The well-known fast updated (FUP) concept from FIM is adapted for the designed algorithm, so that the discovered high average-utility itemsets (HAUIs) are updated efficiently. Based on the designed model for HAUIM with transaction insertion, the proposed IHAUPM algorithm only needs to handle the inserted transactions. Experiments were carried out on six datasets, and the results show that the designed algorithm outperforms the state-of-the-art algorithms that run in batch mode.
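
To make the difference between utility and average utility concrete, the following is a minimal Python sketch. It is not the IHAUPM algorithm. It assumes the usual HAUIM definitions, which the abstract does not spell out: the utility of an itemset in a transaction is the sum of quantity times unit profit over its items, and the average utility divides that sum by the number of items in the itemset. The toy data and the names `itemset_utility`, `average_utility`, `profit`, `original`, and `inserted` are hypothetical and chosen only for illustration.

```python
from typing import Dict, FrozenSet, List

Transaction = Dict[str, int]  # item -> purchased quantity (assumed representation)


def itemset_utility(itemset: FrozenSet[str], transaction: Transaction,
                    profit: Dict[str, int]) -> int:
    """Utility of `itemset` in one transaction; 0 if it is not fully contained."""
    if not all(item in transaction for item in itemset):
        return 0
    return sum(transaction[item] * profit[item] for item in itemset)


def average_utility(itemset: FrozenSet[str], db: List[Transaction],
                    profit: Dict[str, int]) -> float:
    """Utility summed over the database, divided by the itemset length."""
    total = sum(itemset_utility(itemset, t, profit) for t in db)
    return total / len(itemset)


# Hypothetical unit profits and transactions.
profit = {"a": 5, "b": 2, "c": 1}
original = [{"a": 1, "b": 2}, {"b": 1, "c": 3}]
inserted = [{"a": 2, "b": 1, "c": 1}]

ab = frozenset({"a", "b"})

# Average utility is additive over transactions, so the value for the
# updated database can be obtained from the stored value for the original
# database plus a scan of the inserted transactions only.
au_original = average_utility(ab, original, profit)   # (5 + 4) / 2
au_inserted = average_utility(ab, inserted, profit)   # (10 + 2) / 2
au_updated = au_original + au_inserted

# Sanity check against a full batch recomputation.
au_batch = average_utility(ab, original + inserted, profit)
print(au_original, au_inserted, au_updated, au_batch)
```

The sketch only shows why an itemset whose average utility is already stored can be updated from the inserted transactions alone. Deciding which itemsets must be re-examined in the original database after an insertion is the part that an FUP-style method addresses, and it is not modelled here.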
