Mining recent maximal frequent itemsets over data streams with sliding window

The sheer volume of data arriving in streams makes it impractical to mine all recent frequent itemsets. Because the maximal frequent itemsets imply every frequent itemset and are far fewer in number, mining them costs much less time and memory. This paper proposes an improved method, Recent Maximal Frequent Itemsets Mining (RMFIsM), for mining recent maximal frequent itemsets over data streams with a sliding window. RMFIsM uses two matrices to store the information of the data stream: the first stores the information of each transaction, and the second stores the frequent 1-itemsets. Frequent p-itemsets are mined by an "extension" process applied to the frequent 2-itemsets, and the maximal frequent itemsets are obtained by deleting the sub-itemsets of longer frequent itemsets. Finally, the performance of RMFIsM is evaluated in a series of experiments; the results show that the proposed method mines recent maximal frequent itemsets efficiently.
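The pipeline described above can be illustrated with a minimal Python sketch. It is not the RMFIsM implementation: the abstract does not specify the matrix layouts or the exact extension rule, so the 0/1 transaction matrix, the list of frequent columns standing in for the second matrix, the function name `recent_maximal_itemsets`, and the parameters `window_size` and `min_support` (an absolute count) are all assumptions made for illustration. Items are assumed to be sortable values such as strings.

```python
from collections import deque
from itertools import combinations


def recent_maximal_itemsets(stream, window_size, min_support):
    """Illustrative sketch: maximal frequent itemsets of the last
    `window_size` transactions (names and layout are assumptions)."""
    # Sliding window: the deque drops the oldest transaction automatically.
    window = deque(maxlen=window_size)
    for transaction in stream:
        window.append(frozenset(transaction))

    # First matrix (assumed layout): one 0/1 row per transaction,
    # one column per distinct item in the window.
    items = sorted(set().union(*window))
    matrix = [[1 if item in t else 0 for item in items] for t in window]

    def support(cols):
        # Number of rows that contain every column in `cols`.
        return sum(all(row[c] for c in cols) for row in matrix)

    # Stand-in for the second matrix: columns of the frequent 1-itemsets.
    frequent_1 = [c for c in range(len(items)) if support((c,)) >= min_support]

    # Frequent 2-itemsets, built from pairs of frequent items.
    level = [p for p in combinations(frequent_1, 2) if support(p) >= min_support]
    frequent = [(c,) for c in frequent_1] + level

    # "Extension" (assumed rule): grow each frequent p-itemset by one
    # frequent item with a larger column index, keeping frequent results.
    while level:
        next_level = []
        for itemset in level:
            for c in frequent_1:
                if c > itemset[-1]:
                    candidate = itemset + (c,)
                    if support(candidate) >= min_support:
                        next_level.append(candidate)
        frequent.extend(next_level)
        level = next_level

    # Delete every itemset that is a proper subset of another frequent one.
    as_sets = [frozenset(items[c] for c in cols) for cols in frequent]
    return {s for s in as_sets if not any(s < other for other in as_sets)}


stream = [["a", "b", "c"], ["a", "b"], ["a", "c"],
          ["b", "c"], ["a", "b", "c"], ["d"]]
print(recent_maximal_itemsets(stream, window_size=5, min_support=2))
```

With a window of five transactions the oldest transaction falls out of scope, so the three 2-itemsets over a, b and c are maximal; widening the window to six brings the first transaction back in and the single 3-itemset becomes maximal instead. The sketch rebuilds the matrix from scratch for clarity, whereas a streaming method would update it incrementally as the window slides.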
