A linear regression-based frequent itemset forecast algorithm for stream data

Stream data mining extracts knowledge from large, potentially unbounded volumes of stream data, and it must do so while preserving data quality under limited disk or memory capacity. Frequent itemset mining as performed in a traditional transaction environment is impractical in this setting, because it requires determining, over continuously incoming stream data, both which items are currently frequent and which are likely to become frequent. This paper proposes a method for predicting frequent items by applying a regression model to continuously incoming real-time stream data. Once the regression model has been established from the stream data, it can be used to predict items whose future frequency is uncertain. After gathering real-time stream data through a sliding window, the FIM-2DS algorithm computes the support of a specified sequence and derives a linear equation that forecasts the future trend of that sequence.
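
The general idea of sliding-window support followed by a linear trend fit can be illustrated with a minimal sketch. It is not the FIM-2DS algorithm itself, whose details are not given here. It assumes that support is the fraction of transactions in the current window containing the itemset, that one support value is recorded per full window, and that the trend is fitted by ordinary least squares against the window index. All function names (support, windowed_supports, fit_line, forecast_support) are hypothetical and chosen only for illustration.

```python
from collections import deque


def support(window, itemset):
    """Fraction of transactions in the window that contain every item of itemset."""
    if not window:
        return 0.0
    hits = sum(1 for transaction in window if itemset <= transaction)
    return hits / len(window)


def windowed_supports(stream, itemset, window_size):
    """Slide a fixed-size window over the stream; record support once per full window."""
    itemset = frozenset(itemset)
    window = deque(maxlen=window_size)  # oldest transaction is dropped automatically
    supports = []
    for transaction in stream:
        window.append(frozenset(transaction))
        if len(window) == window_size:
            supports.append(support(window, itemset))
    return supports


def fit_line(ys):
    """Ordinary least squares fit of y = slope * x + intercept, with x = 0, 1, ..., n-1."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two support values to fit a line")
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast_support(ys, steps_ahead=1):
    """Extrapolate the fitted line and clamp the result to the valid range [0, 1]."""
    slope, intercept = fit_line(ys)
    x_future = len(ys) - 1 + steps_ahead
    return min(1.0, max(0.0, slope * x_future + intercept))


if __name__ == "__main__":
    # Illustrative stream only; the transactions are made up for this sketch.
    stream = [{"a", "b"}, {"a"}, {"a", "b"}, {"b"}, {"a", "b"}, {"a", "b"}]
    history = windowed_supports(stream, {"a", "b"}, window_size=2)
    print(history, forecast_support(history))
```

In this sketch an itemset whose forecast support exceeds a chosen minimum-support threshold would be flagged as likely to become frequent; the threshold and the window size are parameters the reader would have to choose.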