Interactive Mining of Frequent Itemsets over Arbitrary Time Intervals in a Data Stream

Mining frequent patterns in a data stream is challenging because patterns must be managed in bounded memory while the data are unbounded. Many approaches assume a fixed support threshold, but a changeable threshold is more realistic given how rapidly streaming transactions are updated in practice. In addition, mining itemsets over various time granularities, rather than over the entire stream, offers more flexibility for many applications. We therefore propose an interactive mechanism for mining frequent itemsets over arbitrary time intervals in a data stream that allows a changeable support threshold. A synopsis vector with tilted-time tables maintains statistics of past transactions so that support can be computed over user-specified time periods. Extensive experiments over various parameter settings demonstrate that our approach is efficient and can mine frequent itemsets in the data stream interactively with variable support thresholds.
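
To make the idea of a tilted-time synopsis concrete, the following is a minimal sketch, not the data structure proposed in the paper. It assumes a logarithmic tilting scheme in which level i holds at most two windows, each summarizing 2**i batches, so recent batches are kept at fine granularity and older ones are merged. The class name `TiltedSynopsis`, the `max_len` cap on itemset size, and the restriction of queries to the most recent whole windows are all illustrative assumptions.

```python
from collections import Counter
from itertools import combinations


class TiltedSynopsis:
    """Illustrative tilted-time summary of itemset counts (hypothetical design)."""

    def __init__(self, max_len=2):
        self.max_len = max_len  # assumed cap on itemset size, to bound memory
        # levels[i] holds up to two (n_transactions, Counter) windows,
        # newest first; each window at level i spans 2**i batches.
        self.levels = []

    def add_batch(self, transactions):
        """Count itemsets in one batch and push it into the tilted-time table."""
        counts = Counter()
        for t in transactions:
            items = sorted(set(t))
            for k in range(1, self.max_len + 1):
                counts.update(combinations(items, k))
        carry = (len(transactions), counts)
        for level in self.levels:
            level.insert(0, carry)
            if len(level) <= 2:
                return
            # Level overflow: merge the two oldest windows and promote them.
            n_old, c_old = level.pop()
            n_prev, c_prev = level.pop()
            carry = (n_old + n_prev, c_old + c_prev)
        self.levels.append([carry])

    def windows(self):
        """Yield (span_in_batches, n_transactions, counts), newest first."""
        for i, level in enumerate(self.levels):
            for n, c in level:
                yield 2 ** i, n, c

    def frequent(self, last_batches, min_support):
        """Itemsets with support >= min_support over the most recent whole
        windows that fit inside the last `last_batches` batches."""
        total, counts, covered = 0, Counter(), 0
        for span, n, c in self.windows():
            if covered + span > last_batches:
                break
            covered += span
            total += n
            counts += c
        if total == 0:
            return {}
        return {s: v / total for s, v in counts.items() if v / total >= min_support}


synopsis = TiltedSynopsis(max_len=2)
synopsis.add_batch([["a", "b"], ["a", "c"]])
synopsis.add_batch([["a", "b"], ["b", "c"]])
synopsis.add_batch([["a", "b"], ["a", "b", "c"]])

# The same synopsis answers queries with different periods and thresholds.
recent = synopsis.frequent(last_batches=1, min_support=0.6)
overall = synopsis.frequent(last_batches=3, min_support=0.5)
```

Because the support threshold is applied only at query time, the user can change it between queries without rebuilding the summary. The cost of this sketch is precision: a query period that cuts through a merged window is rounded down to the whole windows it contains.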
