An Efficient Tree-Based Algorithm for Mining High Average-Utility Itemset

High-utility itemset mining (HUIM), an extension of the well-known frequent itemset mining (FIM), has become a key topic in recent years. HUIM aims to find the complete set of itemsets having high utilities in a given dataset. High average-utility itemset mining (HAUIM) is a variation of traditional HUIM. HAUIM provides an alternative measure, named the average-utility, that discovers itemsets by taking into consideration both the utility values and the lengths of the itemsets. HAUIM is important in several application domains, such as business applications, medical data analysis, mobile commerce, and streaming data analysis. In the literature, several algorithms have been proposed, each introducing its own upper-bound models and data structures, to discover high average-utility itemsets (HAUIs) in a given database. However, they require long execution times and consume large amounts of memory. To overcome these limitations, this paper first introduces four novel upper-bounds along with pruning strategies and two data structures. Then, it proposes a pattern growth approach, called the HAUL-Growth algorithm, for efficiently mining HAUIs using the proposed upper-bounds and data structures. Experimental results show that the proposed HAUL-Growth algorithm significantly outperforms the state-of-the-art dHAUIM and TUB-HAUIM algorithms in terms of execution time, number of join operations, memory consumption, and scalability.
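
To make the average-utility measure concrete, the following minimal sketch computes it by brute force on a toy database. It assumes the standard definition from the HAUIM literature: the utility of an itemset, summed over the transactions that contain it, divided by the number of items in the itemset. The database contents, the function names, and the threshold `min_au` are hypothetical and chosen only for illustration. This is not the HAUL-Growth algorithm and uses none of the proposed upper-bounds or data structures.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps an item to its utility
# in that transaction (for example, quantity times unit profit).
transactions = [
    {"a": 5, "b": 2, "c": 1},
    {"a": 10, "c": 6},
    {"b": 4, "c": 3},
]


def utility(itemset, db):
    """Sum the utilities of the itemset over every transaction containing it."""
    total = 0
    for t in db:
        if all(item in t for item in itemset):
            total += sum(t[item] for item in itemset)
    return total


def average_utility(itemset, db):
    """Utility of the itemset divided by its length (number of items)."""
    return utility(itemset, db) / len(itemset)


def mine_hauis(db, min_au):
    """Enumerate all itemsets and keep those with average-utility >= min_au.

    Exhaustive enumeration is exponential in the number of items; it is
    shown only to illustrate the measure, not as a practical miner.
    """
    items = sorted({item for t in db for item in t})
    result = {}
    for k in range(1, len(items) + 1):
        for candidate in combinations(items, k):
            au = average_utility(candidate, db)
            if au >= min_au:
                result[candidate] = au
    return result


if __name__ == "__main__":
    for itemset, au in mine_hauis(transactions, min_au=10).items():
        print(itemset, au)
```

In this toy database the itemset {a, c} has utility 22 but average-utility 11, because the total is divided by the itemset length of 2. This is how the measure avoids favoring longer itemsets simply because they accumulate more utility.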
