Paper Information - Efficient Mining of Temporal High Utility Itemsets from Data streams


Utility mining treats the differing values of individual items as utilities and aims at identifying the itemsets with high utilities. Temporal high utility itemsets are the itemsets whose support exceeds a pre-specified threshold in the current time window of a data stream. Discovering them is an important step in mining interesting patterns, such as association rules, from data streams. In this paper, we propose a novel method, THUI (Temporal High Utility Itemsets)-Mine, for mining temporal high utility itemsets from data streams efficiently and effectively. To the best of our knowledge, this is the first work on mining temporal high utility itemsets from data streams. The novel contribution of THUI-Mine is that it identifies the temporal high utility itemsets while generating fewer temporal high transaction-weighted utilization 2-itemsets, which substantially reduces the execution time of mining all high utility itemsets in data streams. As a result, all temporal high utility itemsets under all time windows of a data stream can be discovered with limited memory space, fewer candidate itemsets, and less CPU and I/O time. This meets the critical requirements on time and space efficiency for mining data streams. The experimental results show that THUI-Mine discovers the temporal high utility itemsets with higher performance and fewer candidate itemsets than other algorithms under various experimental conditions.
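The abstract names the quantities involved without defining them. The sketch below is a brute-force illustration of those quantities and is not the THUI-Mine algorithm. It assumes the definitions commonly used in the utility mining literature: an itemset's utility is the sum of quantity times unit profit over the transactions that contain it, and its transaction-weighted utilization (TWU) is the sum of the full utilities of those transactions. The profit table, the sample stream, the window size, and all function names are hypothetical.

```python
from itertools import combinations

# Hypothetical unit profit per item (assumption, not from the paper).
PROFIT = {"a": 5, "b": 2, "c": 1}

# Hypothetical stream: each transaction maps an item to its quantity.
STREAM = [
    {"a": 1, "b": 2},
    {"b": 1, "c": 4},
    {"a": 2, "c": 1},
    {"a": 1, "b": 1, "c": 1},
]


def transaction_utility(tx):
    """Total utility of one transaction: sum of quantity * unit profit."""
    return sum(qty * PROFIT[item] for item, qty in tx.items())


def itemset_utility(itemset, window):
    """Utility of an itemset over the transactions that contain all its items."""
    return sum(
        sum(tx[i] * PROFIT[i] for i in itemset)
        for tx in window
        if all(i in tx for i in itemset)
    )


def twu(itemset, window):
    """Transaction-weighted utilization: an upper bound on the itemset's utility."""
    return sum(
        transaction_utility(tx) for tx in window if all(i in tx for i in itemset)
    )


def mine_window(window, min_utility):
    """Brute force: prune by TWU first, then keep itemsets whose exact utility
    reaches min_utility. Enumerates every itemset, so it only suits toy data."""
    items = sorted({i for tx in window for i in tx})
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            if twu(itemset, window) < min_utility:
                continue  # utility <= TWU, so this itemset cannot qualify
            utility = itemset_utility(itemset, window)
            if utility >= min_utility:
                result[itemset] = utility
    return result


def sliding_windows(stream, size):
    """Yield consecutive windows of `size` transactions, advancing by one."""
    for start in range(len(stream) - size + 1):
        yield stream[start:start + size]


if __name__ == "__main__":
    for number, window in enumerate(sliding_windows(STREAM, 3), start=1):
        print(number, mine_window(window, min_utility=10))
```

The TWU check is safe as a pruning step because an itemset's utility in a transaction can never exceed that transaction's total utility. According to the abstract, THUI-Mine's gain comes from generating fewer candidate 2-itemsets at this stage; the sketch re-enumerates every itemset in every window and does not attempt that saving.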
