KNN Classifier with Self Adjusting Memory for Heterogeneous Concept Drift

Data mining in non-stationary data streams is gaining more attention recently, especially in the context of the Internet of Things and Big Data. It is a highly challenging task, since the fundamentally different types of possibly occurring drift undermine classical assumptions such as i.i.d. data or stationary distributions. Available algorithms either struggle with certain forms of drift or require a priori knowledge in the form of a task-specific setting. We propose the Self Adjusting Memory (SAM) model for the k Nearest Neighbor (kNN) algorithm, since kNN constitutes a proven classifier within the streaming setting. SAM-kNN can deal with heterogeneous concept drift, i.e., different drift types and rates, using biologically inspired memory models and their coordination. It can be easily applied in practice, since an optimization of the meta-parameters is not necessary. The basic idea is to construct dedicated models for the current and former concepts and to apply them according to the demands of the given situation. An extensive evaluation on various benchmarks, consisting of artificial streams with known drift characteristics as well as real-world datasets, is conducted. Thereby, we explicitly add new benchmarks enabling a precise performance evaluation on multiple types of drift. The highly competitive results throughout all experiments underline the robustness of SAM-kNN as well as its capability to handle heterogeneous concept drift.
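To make the basic idea concrete, below is a minimal sketch of the memory architecture the abstract describes: a short-term memory (STM) for the current concept, a long-term memory (LTM) preserving former concepts, and prediction by whichever memory performed best on recent examples. The class name, window sizes, and the simple rule that transfers evicted STM examples to the LTM are illustrative assumptions for this sketch, not the authors' exact method (which adapts the STM size and compresses/cleans the LTM).

```python
# Illustrative sketch of the SAM-kNN memory idea (assumed simplification,
# not the paper's exact algorithm).
from collections import deque

import numpy as np


def knn_predict(X, y, query, k=5):
    """Plain majority-vote kNN over the stored (X, y) pairs."""
    if len(X) == 0:
        return None
    dists = np.linalg.norm(np.asarray(X) - np.asarray(query), axis=1)
    nearest = np.argsort(dists)[:k]
    labels = [y[i] for i in nearest]
    return max(set(labels), key=labels.count)


class SimpleSAMKNN:
    def __init__(self, k=5, stm_max=500, eval_window=50):
        self.k = k
        self.stm_max = stm_max                 # assumed fixed STM capacity
        self.stm_X, self.stm_y = [], []        # short-term memory (current concept)
        self.ltm_X, self.ltm_y = [], []        # long-term memory (former concepts)
        # Recent correctness of each memory, used to pick the predictor.
        self.hits = {m: deque(maxlen=eval_window) for m in ("stm", "ltm", "cm")}

    def _predict_all(self, x):
        cm_X = self.stm_X + self.ltm_X         # combined memory
        cm_y = self.stm_y + self.ltm_y
        return {
            "stm": knn_predict(self.stm_X, self.stm_y, x, self.k),
            "ltm": knn_predict(self.ltm_X, self.ltm_y, x, self.k),
            "cm": knn_predict(cm_X, cm_y, x, self.k),
        }

    def predict(self, x):
        preds = self._predict_all(x)
        # Use the memory with the best recent accuracy; fall back to the STM.
        best = max(self.hits,
                   key=lambda m: np.mean(self.hits[m]) if self.hits[m] else 0.0)
        return preds[best] if preds[best] is not None else preds["stm"]

    def partial_fit(self, x, label):
        # Test-then-train: score each memory on the new example first.
        for m, p in self._predict_all(x).items():
            self.hits[m].append(1.0 if p == label else 0.0)
        self.stm_X.append(np.asarray(x, dtype=float))
        self.stm_y.append(label)
        if len(self.stm_X) > self.stm_max:
            # Evicted STM examples move to the LTM instead of being discarded,
            # so former concepts stay available after a drift.
            self.ltm_X.append(self.stm_X.pop(0))
            self.ltm_y.append(self.stm_y.pop(0))
```

The design point this sketch illustrates is why heterogeneous drift is handled: after abrupt drift the STM quickly dominates the recent-accuracy comparison, while for recurring concepts the LTM (or combined memory) wins again, so no drift-type-specific tuning of meta-parameters is required.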
