Mining Discriminative High Utility Patterns

Recently, many approaches to high utility pattern mining (HUPM) have been proposed, but most of them aim at mining high utility patterns (HUPs) rather than frequent ones. Their major drawback is that any combination of a low-utility item with a very high utility pattern is regarded as a HUP, even if the combination is infrequent and contains items that rarely co-occur. The HUIPM algorithm was therefore proposed to derive high utility interesting patterns (HUIPs) with strong frequency affinity. However, it recursively constructs a series of conditional trees to produce candidates and then derives the HUIPs from them, which is time-consuming and may lead to a combinatorial explosion. In this paper, a Fast algorithm for mining Discriminative High Utility Patterns with strong frequency affinity (FDHUP) is proposed, which considers both the utility and the frequency affinity constraints. Two compact structures, named EI-table and FU-table, and two pruning strategies are designed to reduce the search space and to discover discriminative high utility patterns (DHUPs) efficiently and effectively. Experimental results show that the proposed FDHUP algorithm considerably outperforms the state-of-the-art HUIPM algorithm on all datasets.
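The drawback described above can be illustrated with a minimal sketch. It is not FDHUP or HUIPM: it is a brute-force enumeration over a hypothetical toy database, and the item names, unit profits, quantities, and the threshold `min_util` are all assumptions made for illustration. It uses the standard definition of itemset utility (quantity times unit profit, summed over the transactions that contain the whole itemset) and reports the support count next to each utility.

```python
from itertools import combinations

# Hypothetical toy data (not from the paper): unit profit per item and
# purchase quantities per transaction. Item "c" is far more profitable.
profit = {"a": 1, "b": 2, "c": 100}
transactions = [
    {"a": 2, "b": 1},
    {"a": 1, "b": 3},
    {"a": 1, "c": 5},
    {"b": 2, "c": 4},
    {"a": 1, "b": 1, "c": 6},
]


def utility(itemset, db, profit):
    """Sum quantity * unit profit over transactions containing every item."""
    total = 0
    for t in db:
        if all(item in t for item in itemset):
            total += sum(t[item] * profit[item] for item in itemset)
    return total


def support(itemset, db):
    """Count the transactions that contain every item of the itemset."""
    return sum(1 for t in db if all(item in t for item in itemset))


def high_utility_patterns(db, profit, min_util):
    """Brute-force enumeration: keep itemsets whose utility >= min_util."""
    items = sorted(profit)
    result = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            u = utility(combo, db, profit)
            if u >= min_util:
                result[combo] = (u, support(combo, db))
    return result


if __name__ == "__main__":
    for pattern, (u, s) in high_utility_patterns(transactions, profit, 600).items():
        print(pattern, "utility =", u, "support =", s)
```

In this toy database the itemset {a, b, c} passes the utility threshold of 600 although it occurs in a single transaction, because nearly all of its utility comes from item c. A frequency affinity constraint is meant to filter out patterns of this kind.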
