Classification and evaluation of data mining techniques for data stream requirements

In recent years, the management and processing of data streams have become topics of active research in several fields of computer science, such as distributed systems, database systems, and data mining. In data stream applications such as network monitoring, telecommunication systems, and sensor networks, monitoring happens online, so user queries must be answered efficiently in both time and space. The two main challenges are designing fast mining methods for data streams and promptly detecting changes in concepts and data distribution, which arise from the highly dynamic nature of data streams. The goal of this article is to analyze and classify the application of diverse data mining techniques to the different challenges of data stream mining. To this end, the article categorizes and analyzes related research in order to arrive at a framework that maps data mining techniques to data stream mining challenges and requirements.
