AN EFFICIENT GRADUAL PRUNING TECHNIQUE FOR UTILITY MINING

Utility mining has recently become a prominent research topic in knowledge discovery because of its many practical applications. In utility mining, a high utility itemset is evaluated by considering not only the quantities but also the profits of the items in transactions. Most previous approaches relied on the traditional utility upper-bound model to find high utility itemsets in databases. With that model, however, a huge number of candidates must be generated, and considerable time is spent computing the utility upper bounds of itemsets during mining. In this paper, we therefore propose a level-wise mining approach that finds high utility itemsets in databases efficiently. In particular, a pruning strategy is designed to progressively tighten the utility upper bounds of itemsets from pass to pass. In addition, the data size is gradually reduced across passes to save data scan time. Finally, experimental results on synthetic datasets and a real dataset show that the proposed approach outperforms the traditional two-phase utility mining approach in both pruning effect and execution efficiency.
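To make the terms above concrete, the following minimal Python sketch computes the utility of an itemset (quantity times profit, summed over the transactions that contain the itemset) and the transaction-weighted utilization commonly used as the traditional utility upper bound in two-phase mining. It illustrates these standard definitions only and is not the gradual pruning strategy proposed in this paper. The function names, the toy database, and the profit table are hypothetical and chosen for illustration.

```python
from typing import Dict, Iterable, List

# A transaction maps each purchased item to its quantity (illustrative format).
Transaction = Dict[str, int]


def transaction_utility(t: Transaction, profit: Dict[str, int]) -> int:
    """Total utility of one transaction: sum of quantity * unit profit."""
    return sum(qty * profit[item] for item, qty in t.items())


def itemset_utility(itemset: Iterable[str], db: List[Transaction],
                    profit: Dict[str, int]) -> int:
    """Actual utility of an itemset over all transactions that contain it."""
    items = set(itemset)
    total = 0
    for t in db:
        if items.issubset(t.keys()):
            # Only the items of the itemset contribute to its utility.
            total += sum(t[i] * profit[i] for i in items)
    return total


def twu(itemset: Iterable[str], db: List[Transaction],
        profit: Dict[str, int]) -> int:
    """Transaction-weighted utilization: an upper bound on itemset utility.

    It sums the full utility of every transaction containing the itemset,
    so it can never be smaller than the itemset's actual utility.
    """
    items = set(itemset)
    return sum(transaction_utility(t, profit)
               for t in db if items.issubset(t.keys()))


# Hypothetical toy data: unit profits and three transactions.
profit = {"A": 3, "B": 10, "C": 1}
db = [
    {"A": 1, "B": 1},
    {"A": 2, "C": 5},
    {"A": 1, "B": 2, "C": 1},
]

ab_utility = itemset_utility({"A", "B"}, db, profit)
ab_bound = twu({"A", "B"}, db, profit)
print(ab_utility, ab_bound)
```

In this toy example the upper bound for {A, B} is only slightly above its actual utility, whereas for the single item A the bound is much looser. Loose bounds of this kind are what lead to the large candidate sets mentioned above.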
