Interactive Mining of Frequent Patterns in a Data Stream of Time-Fading Models

Mining frequent itemsets in data streams is an emerging research topic. Previous approaches generally assume a fixed minimum support threshold when mining patterns in a stream. In practice, however, it is more desirable to let users specify the minimum support interactively. In addition, the importance of stream data tends to decrease over time, so mining frequent patterns under a time-fading model is important for many applications. In this paper, we propose an algorithm that allows users to change the minimum support at any time while mining recently frequent itemsets in data streams under a time-fading model. A synopsis vector with a support-decaying mechanism is constructed to summarize past transactions. A batch of transactions is incorporated into the synopsis for potential re-mining if the support changes. Extensive experiments were conducted over various datasets. The experimental results show that our approach achieves high precision and recall for mining recently frequent itemsets over data streams with variable support thresholds.
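
The general idea of decayed support counting with a query-time threshold can be illustrated with a minimal Python sketch. This is not the algorithm proposed in the paper: the class name `FadingSynopsis`, the per-batch `decay` factor, and the `max_len` cap on itemset size are all assumptions made for illustration, and a plain dictionary stands in for the synopsis vector. The sketch only shows how older batches can lose weight as new ones arrive and how the minimum support can be supplied anew on each query.

```python
from collections import defaultdict
from itertools import combinations


class FadingSynopsis:
    """Decayed itemset counts; an illustrative stand-in, not the paper's synopsis vector."""

    def __init__(self, decay=0.9, max_len=2):
        self.decay = decay        # assumed per-batch fading factor in (0, 1]
        self.max_len = max_len    # assumed cap on itemset size to bound memory
        self.counts = defaultdict(float)
        self.total = 0.0          # decayed number of transactions seen so far

    def add_batch(self, batch):
        # Fade everything summarized so far, then add the new batch at full weight.
        for itemset in self.counts:
            self.counts[itemset] *= self.decay
        self.total = self.total * self.decay + len(batch)
        for transaction in batch:
            items = sorted(set(transaction))
            for k in range(1, self.max_len + 1):
                for itemset in combinations(items, k):
                    self.counts[itemset] += 1.0

    def frequent(self, min_support):
        # min_support is given per query, so it may differ between calls.
        if self.total == 0:
            return {}
        return {
            itemset: count / self.total
            for itemset, count in self.counts.items()
            if count / self.total >= min_support
        }


if __name__ == "__main__":
    synopsis = FadingSynopsis(decay=0.5, max_len=2)
    synopsis.add_batch([["a", "b"], ["a", "c"]])
    synopsis.add_batch([["b", "c"], ["b"]])
    print(synopsis.frequent(0.5))  # stricter threshold
    print(synopsis.frequent(0.3))  # looser threshold, same synopsis, no rescan
```

Because the threshold is applied only when `frequent` is called, lowering or raising it does not require rescanning past transactions in this simplified setting. Enumerating every itemset up to `max_len` is a shortcut that would not scale to real streams.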
