An Instance Based Learning Model for Classification in Data Streams with Concept Change

Mining data streams has attracted the attention of the scientific community in recent years, leading to the development of new algorithms for processing and sorting data in this area. Incremental learning techniques have been used extensively to address these problems. A major challenge posed by data streams is that their underlying concepts can change over time. This research studies the application of different classification techniques to data streams and proposes a similarity-based model that includes a new methodology for detecting and handling concept change. Preliminary experiments are conducted with the model because it has several parameters that must be tuned. A comparative statistical analysis is presented that shows the performance of the proposed algorithm.
