A Comparative Study of Frequent Pattern Recognization Techniques from Stream Data

Mining frequent patterns from data streams is a challenging task. Finding frequent patterns in data streams is important in many applications, such as stock market prediction, sensor data analysis, network traffic analysis, e-business, and telecommunication data analysis. The Frequent Pattern Stream tree [1] maintains frequent patterns over a period of time using a modified FP-tree algorithm. This approach maintains a tilted time window at each node, which consumes more space. The Compact Pattern Stream Tree [2] assumes that only current patterns are of importance and uses a sliding window protocol to maintain them. This approach gives no weight to past frequent patterns. Owing to advances in communication and storage technologies, a large number of data streams are generated by various applications and devices. Researchers have developed various methods to extract useful patterns from data streams. Many of these algorithms extend techniques that mine transaction data. Each method works under different conditions, such as offline streams, online streams, video streams, and audio streams. The performance and efficiency of the methods vary according to the type of data stream. This paper studies a few recent and popular methods that extract patterns from stream data. It also presents a comparative analysis of these methods with reference to the conditions under which they work and their advantages and drawbacks.
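The sliding-window idea mentioned above can be illustrated with a minimal sketch. This is not the Compact Pattern Stream Tree or any other surveyed structure; it is a naive counter, assuming a fixed window measured in transactions and itemsets of bounded length. The class name `SlidingWindowCounter` and the parameters `window_size`, `max_len`, and `min_support` are hypothetical and chosen only for illustration.

```python
from collections import Counter, deque
from itertools import combinations


class SlidingWindowCounter:
    """Naive itemset counter over the most recent `window_size` transactions.

    Illustrative only: a real stream miner would use a compact tree
    structure instead of enumerating itemsets explicitly.
    """

    def __init__(self, window_size, max_len=2):
        self.window_size = window_size  # number of transactions kept
        self.max_len = max_len          # largest itemset size counted
        self.window = deque()           # transactions currently in the window
        self.counts = Counter()         # itemset -> support inside the window

    def _itemsets(self, transaction):
        # Enumerate all itemsets of size 1..max_len in canonical (sorted) order.
        items = sorted(set(transaction))
        for k in range(1, self.max_len + 1):
            yield from combinations(items, k)

    def add(self, transaction):
        # Count the new transaction.
        self.window.append(transaction)
        self.counts.update(self._itemsets(transaction))
        # Expire the oldest transaction once the window overflows.
        if len(self.window) > self.window_size:
            expired = self.window.popleft()
            for itemset in self._itemsets(expired):
                self.counts[itemset] -= 1
                if self.counts[itemset] == 0:
                    del self.counts[itemset]

    def frequent(self, min_support):
        # Itemsets whose support in the current window meets the threshold.
        return {s: c for s, c in self.counts.items() if c >= min_support}
```

Once a transaction leaves the window its contribution is discarded entirely, which is the behaviour the abstract describes as giving no weight to past frequent patterns. A tilted time window would instead keep older counts at coarser granularity, at the cost of extra space per node.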

[1]  Gopal K Gupta,et al.  Introduction to Data Mining with Case Studies , 2011 .

[2]  Carson Kai-Sang Leung,et al.  DSTree: A Tree Structure for the Mining of Frequent Sets from Data Streams , 2006, Sixth International Conference on Data Mining (ICDM'06).

[3]  Young-Koo Lee,et al.  CP-Tree: A Tree Structure for Single-Pass Frequent Pattern Mining , 2008, PAKDD.

[4]  Hongjun Lu,et al.  False Positive or False Negative: Mining Frequent Itemsets from High Speed Transactional Data Streams , 2004, VLDB.

[5]  Young-Koo Lee,et al.  Sliding window-based frequent pattern mining over data streams , 2009, Inf. Sci..

[6]  Xindong Wu,et al.  Robust ensemble learning for mining noisy data streams , 2011, Decis. Support Syst..

[7]  Vikram Pudi,et al.  Data Mining , 2008 .

[8]  Jiawei Han,et al.  Data Mining: Concepts and Techniques , 2000 .

[9]  Li Guo,et al.  Classifier and Cluster Ensembles for Mining Concept Drifting Data Streams , 2010, 2010 IEEE International Conference on Data Mining.

[10]  沈錳坤 An Efficient Algorithm for Mining Frequent Itemsets over the Entire History of Data Streams , 2004 .

[11]  Arbee L. P. Chen,et al.  Mining Frequent Itemsets from Data Streams with a Time-Sensitive Sliding Window , 2005, SDM.

[12]  Carson Kai-Sang Leung,et al.  Efficient Mining of Constrained Frequent Patterns from Streams , 2006, 2006 10th International Database Engineering and Applications Symposium (IDEAS'06).

[13]  Kuen-Fang Jea,et al.  Mining frequent patterns from dynamic data streams with data load management , 2012, J. Syst. Softw..

[14]  Won Suk Lee,et al.  Finding recent frequent itemsets adaptively over online data streams , 2003, KDD '03.

[15]  Philip S. Yu,et al.  Mining Frequent Patterns in Data Streams at Multiple Time Granularities , 2002 .

[16]  Suh-Yin Lee,et al.  An Efficient Algorithm for Mining Frequent Itemsets over the Entire History of Data Streams , 2004 .

[17]  Philip S. Yu,et al.  A General Framework for Mining Concept-Drifting Data Streams with Skewed Distributions , 2007, SDM.

[18]  Rajeev Motwani,et al.  Approximate Frequency Counts over Data Streams , 2012, VLDB.