Paper information - Mining correlated high-utility itemsets using various measures

Mining correlated high-utility itemsets using various measures

Discovering high-utility itemsets consists of finding sets of items that yield a high profit in customer transaction databases. An important limitation of traditional high-utility itemset mining is that the utility measure is the only criterion used to assess the interestingness of patterns. As a result, many of the itemsets found have a high profit but contain items that are weakly correlated. To address this issue, this paper proposes to integrate the concept of correlation into high-utility itemset mining, so as to find profitable itemsets that are also highly correlated, using the all-confidence and bond measures. An algorithm named FCHM (Fast Correlated High-utility itemset Miner) is proposed to discover correlated high-utility itemsets efficiently. Two versions of the algorithm are proposed, named FCHM_all-confidence and FCHM_bond, based on the all-confidence and bond measures, respectively. An experimental evaluation was carried out using four real-life benchmark datasets from the high-utility itemset mining literature: mushroom, retail, kosarak and foodmart. Results show that FCHM is efficient and can prune a large number of weakly correlated high-utility itemsets.
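
To make the measures named in the abstract concrete, the following is a minimal brute-force sketch in Python. It is not the FCHM algorithm, which the abstract does not describe in detail; it only illustrates what a correlated high-utility itemset is. It assumes the usual definitions: the utility of an itemset is the sum of quantity times unit profit over the transactions that contain all of its items, all-confidence is the support of the itemset divided by the largest support of any single item in it, and bond is the support divided by the number of transactions containing at least one of its items. The toy database, the profit table, and all function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps an item to its purchased quantity.
TRANSACTIONS = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 1},
    {"b": 1, "d": 3},
    {"a": 1, "b": 1, "c": 2, "d": 1},
]
# Hypothetical unit profit of each item.
PROFIT = {"a": 5, "b": 2, "c": 1, "d": 3}


def utility(itemset):
    """Sum of quantity * unit profit over transactions containing every item."""
    return sum(
        sum(t[i] * PROFIT[i] for i in itemset)
        for t in TRANSACTIONS
        if all(i in t for i in itemset)
    )


def support(itemset):
    """Number of transactions containing every item of the itemset."""
    return sum(1 for t in TRANSACTIONS if all(i in t for i in itemset))


def disjunctive_support(itemset):
    """Number of transactions containing at least one item of the itemset."""
    return sum(1 for t in TRANSACTIONS if any(i in t for i in itemset))


def all_confidence(itemset):
    """Support of the itemset divided by the largest single-item support."""
    largest = max(support((i,)) for i in itemset)
    return support(itemset) / largest if largest else 0.0


def bond(itemset):
    """Support of the itemset divided by its disjunctive support."""
    disj = disjunctive_support(itemset)
    return support(itemset) / disj if disj else 0.0


def correlated_high_utility_itemsets(min_util, min_corr, measure):
    """Enumerate every itemset and keep those passing both thresholds.

    Exhaustive enumeration is exponential in the number of items; it is
    used here only because the toy database is tiny.
    """
    items = sorted(PROFIT)
    return [
        itemset
        for size in range(1, len(items) + 1)
        for itemset in combinations(items, size)
        if utility(itemset) >= min_util and measure(itemset) >= min_corr
    ]


if __name__ == "__main__":
    # With a correlation threshold of 0, only the utility constraint applies.
    print(correlated_high_utility_itemsets(15, 0.0, bond))
    # Requiring bond >= 0.9 removes the weakly correlated itemsets.
    print(correlated_high_utility_itemsets(15, 0.9, bond))
```

On this toy data, five itemsets reach a utility of at least 15, but only {a} and {a, c} also have a bond of at least 0.9. For example, {a, b} has utility 16 but its items co-occur in only two of the four transactions that contain either of them, giving a bond of 0.5.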
