Advances in data stream mining

Mining data streams has been a focal point of research interest over the past decade. Hardware and software advances have raised the significance of this area by enabling data to be generated faster than ever. Such rapidly generated data are termed data streams. Credit card transactions, Google searches, phone calls in a city, and many others are typical data streams. In many important applications, this streaming data must be analyzed in real time. Traditional data mining techniques have fallen short in addressing the needs of data stream mining. Randomization, approximation, and adaptation have been used extensively in developing new techniques or adapting existing ones to operate in a streaming environment. This paper reviews key milestones and the state of the art in data stream mining. Future insights are also presented. © 2011 Wiley Periodicals, Inc.
