Fast Mining of Closed Frequent Itemsets in Data Streams

With the emergence of large-volume, high-speed streaming data, traditional techniques for mining closed frequent itemsets have become inefficient. Online mining of closed frequent itemsets over streaming data is one of the most important problems in data stream mining. In this paper, a combined data structure is designed that uses an efficient bit-vector to represent items and an extended dictionary frequent item list to record the current closed frequent itemset information in the stream. To sharply reduce the search space, new search strategies are proposed that avoid generating a large number of intermediate itemsets. New pruning strategies are also proposed to maintain the closure-check operations efficiently and dynamically. Experimental results show that the proposed method is time-efficient, scales well as the number of processed transactions increases, and adapts rapidly to changes in the data stream.
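To make the bit-vector idea concrete, the following is a minimal sketch, not the data structure or algorithm of this paper. It assumes a fixed batch of transactions rather than a stream, stores each item as a Python integer whose bit i is set when transaction i contains the item, and checks closure by brute force. The function names (`build_bit_vectors`, `tidset`, `support`, `is_closed`) are hypothetical, and the extended dictionary frequent item list and the search and pruning strategies are not modeled.

```python
from typing import Dict, Iterable, List, Set


def build_bit_vectors(transactions: List[Iterable[str]]) -> Dict[str, int]:
    """Map each item to an integer bit-vector; bit i is set if transaction i has the item."""
    vectors: Dict[str, int] = {}
    for tid, items in enumerate(transactions):
        for item in items:
            vectors[item] = vectors.get(item, 0) | (1 << tid)
    return vectors


def tidset(itemset: Set[str], vectors: Dict[str, int], n: int) -> int:
    """Bit-vector of the transactions containing every item of the itemset."""
    bits = (1 << n) - 1  # start with all n transactions
    for item in itemset:
        bits &= vectors.get(item, 0)
    return bits


def support(itemset: Set[str], vectors: Dict[str, int], n: int) -> int:
    """Number of transactions containing the itemset (population count of its tidset)."""
    return bin(tidset(itemset, vectors, n)).count("1")


def is_closed(itemset: Set[str], vectors: Dict[str, int], n: int) -> bool:
    """An itemset is closed if no proper superset has the same support.

    Brute-force check (an assumption of this sketch): the itemset is not
    closed if some other item occurs in every transaction of its tidset.
    """
    bits = tidset(itemset, vectors, n)
    for item, vec in vectors.items():
        if item not in itemset and (bits & vec) == bits:
            return False
    return True


# Illustrative data, invented for this sketch.
transactions = [["a", "b"], ["a", "b", "c"], ["a", "c"], ["a", "b"]]
vectors = build_bit_vectors(transactions)
n = len(transactions)
```

With these four transactions, every transaction containing `b` also contains `a`, so `{b}` is not closed while `{a, b}` is. A streaming version would additionally need to shift or recycle bit positions as transactions enter and leave the window, which this sketch leaves out.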