A General Method for Mining High-Utility Itemsets with Correlated Measures

ABSTRACT Discovering high-utility itemsets (HUIs) in a transaction database is a central task in High-Utility Itemset Mining (HUIM). Each discovered HUI must meet a user-defined minimum utility threshold. Several methods have been proposed to solve this problem efficiently. However, they focus solely on exploring and discovering the set of HUIs. This research proposes a more general approach, named the General Method for Correlated High-utility itemset Mining (GMCHM), which mines HUIs using any user-specified correlated measure. The approach can discover HUIs that are highly correlated according to the all_confidence and bond measures, as well as 38 other correlated measures. Evaluations were carried out on standard HUIM datasets such as Accidents, BMS_utility and Connect. The results show that GMCHM is highly effective in terms of running time, memory usage and the number of scanned candidates.
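To make the two named measures concrete, the following is a minimal Python sketch of what it means for an itemset to be both high-utility and highly correlated. It is not GMCHM itself and contains none of its pruning or search strategy; it only checks a single itemset by brute force. It assumes the commonly used definitions: all_confidence is the support of the itemset divided by the largest support of any single item in it, and bond is the number of transactions containing every item divided by the number containing at least one item. The toy database, the unit profits, the thresholds `min_util` and `min_cor`, and all function names are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical toy database: each transaction maps item -> purchased quantity.
TRANSACTIONS: List[Dict[str, int]] = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "d": 4},
    {"a": 1, "b": 1, "c": 2, "d": 1},
]
# Hypothetical unit profit of each item.
UNIT_PROFIT: Dict[str, int] = {"a": 5, "b": 2, "c": 1, "d": 3}


def support(itemset: Iterable[str], transactions: List[Dict[str, int]]) -> int:
    """Number of transactions that contain every item of the itemset."""
    items = set(itemset)
    return sum(1 for t in transactions if all(i in t for i in items))


def utility(itemset: Iterable[str], transactions: List[Dict[str, int]],
            profit: Dict[str, int]) -> int:
    """Sum of quantity * unit profit over all transactions containing the itemset."""
    items = set(itemset)
    return sum(
        sum(t[i] * profit[i] for i in items)
        for t in transactions
        if all(i in t for i in items)
    )


def all_confidence(itemset: Iterable[str], transactions: List[Dict[str, int]]) -> float:
    """support(X) divided by the largest support of any single item in X."""
    items = set(itemset)
    max_item_support = max(support({i}, transactions) for i in items)
    return support(items, transactions) / max_item_support if max_item_support else 0.0


def bond(itemset: Iterable[str], transactions: List[Dict[str, int]]) -> float:
    """Conjunctive support divided by disjunctive support."""
    items = set(itemset)
    disjunctive = sum(1 for t in transactions if any(i in t for i in items))
    return support(items, transactions) / disjunctive if disjunctive else 0.0


def is_correlated_hui(itemset: Iterable[str], transactions: List[Dict[str, int]],
                      profit: Dict[str, int], min_util: int, min_cor: float,
                      measure: Callable[..., float] = all_confidence) -> bool:
    """Brute-force check: utility and correlation both reach their thresholds."""
    items = set(itemset)
    return (utility(items, transactions, profit) >= min_util
            and measure(items, transactions) >= min_cor)


if __name__ == "__main__":
    for x in ({"a", "c"}, {"a", "b"}):
        print(sorted(x), utility(x, TRANSACTIONS, UNIT_PROFIT),
              all_confidence(x, TRANSACTIONS), bond(x, TRANSACTIONS))
```

Passing the measure as a parameter mirrors, in a very loose way, the idea of mining with any user-specified correlated measure: the same utility check can be paired with `all_confidence`, `bond`, or another function with the same signature.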
