Frequent Items Mining on Data Stream Based on Weighted Counts

Frequent items mining is an important data mining task with many real-world applications. By taking the different weights of items into account, weighted frequent items mining can discover more important knowledge than traditional frequent pattern mining. In this paper, we present a new algorithm, count-MH, for discovering weighted frequent items over data streams. The proposed method is based on a weighting factor and a hash function; we give its space complexity and the average processing time per item. Experimental results show that count-MH is efficient for frequent items mining.
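
The abstract does not spell out how count-MH works, so the following is only a generic sketch of the two ingredients it names, item weights and hash functions. It is not count-MH itself. It assumes a count-min-style table in which each arriving item adds its (non-negative) weight to one hashed bucket per row. The class name `WeightedCountSketch`, the parameters `width` and `depth`, and the `frequent` helper are all hypothetical names chosen for illustration.

```python
import hashlib


class WeightedCountSketch:
    """Count-min-style table with weighted updates (illustrative, not count-MH)."""

    def __init__(self, width=256, depth=4):
        self.width = width    # buckets per row (assumed parameter)
        self.depth = depth    # number of independent hash rows (assumed parameter)
        self.table = [[0.0] * width for _ in range(depth)]
        self.total = 0.0      # total weight seen so far

    def _bucket(self, item, row):
        # One salted hash per row gives `depth` different bucket assignments.
        digest = hashlib.blake2b(
            repr(item).encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def update(self, item, weight=1.0):
        """Add a non-negative weight for one arriving stream item."""
        self.total += weight
        for row in range(self.depth):
            self.table[row][self._bucket(item, row)] += weight

    def estimate(self, item):
        """Weighted count estimate; never below the true weighted count."""
        return min(
            self.table[row][self._bucket(item, row)] for row in range(self.depth)
        )

    def frequent(self, candidates, threshold):
        """Candidates whose estimated weight is at least threshold * total."""
        return [c for c in candidates if self.estimate(c) >= threshold * self.total]


sketch = WeightedCountSketch()
stream = [("a", 2.0)] * 50 + [("b", 1.0)] * 10 + [("c", 0.5)] * 10
for item, weight in stream:
    sketch.update(item, weight)

heavy = sketch.frequent(["a", "b", "c"], threshold=0.5)
```

In this sketch memory is fixed at `width * depth` counters regardless of stream length, and each update touches `depth` buckets. Because hash collisions can only add weight to a bucket, the estimate may overstate an item's weighted count but never understates it.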
