Dedicated Memory Models for Continual Learning in the Presence of Concept Drift

Data mining in non-stationary data streams has been gaining increasing attention recently, especially in the context of the Internet of Things and Big Data. It is a highly challenging task, since the different types of possibly occurring concept drift undermine classical assumptions such as data independence or stationary distributions. We propose the Self Adjusting Memory (SAM) model, which can deal with heterogeneous concept drift, i.e., different drift types and rates, using biologically inspired memory models and their coordination. The idea is to construct dedicated models for the current and former concepts and to apply them according to the given situation. This general approach can be combined with various classifiers meeting certain conditions, which we specify in this contribution. SAM is easy to use in practice, since a task-specific optimization of the meta-parameters is not necessary. We recap the merits of our architecture with the k-Nearest-Neighbor classifier and evaluate it on artificial as well as real-world benchmarks. SAM's highly competitive results throughout all experiments underline its robustness as well as its capability to handle heterogeneous concept drift.
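The dual-memory idea described above can be illustrated with a minimal sketch: a short-term memory (STM) holds recent examples representing the current concept, examples evicted from the STM are preserved in a long-term memory (LTM) for former concepts, and prediction delegates to whichever memory has recently been more accurate. This is a toy illustration under simplifying assumptions, not the authors' SAM implementation — the class `SimpleSAM`, the fixed STM window, and the smoothed accuracy counters are all hypothetical simplifications (the actual model, e.g., adapts the STM size and cleans the LTM).

```python
from collections import deque


def knn_predict(memory, x, k=3):
    """Majority vote among the k nearest stored (features, label) pairs."""
    if not memory:
        return None
    nearest = sorted(
        memory,
        key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], x)),
    )[:k]
    votes = {}
    for _, label in nearest:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)


class SimpleSAM:
    """Toy two-memory learner inspired by SAM (hypothetical simplification).

    STM: sliding window of recent examples (current concept).
    LTM: examples pushed out of the STM (former concepts).
    Prediction uses whichever memory was more accurate so far.
    """

    def __init__(self, stm_size=50, k=3):
        self.stm = deque(maxlen=stm_size)
        self.ltm = []
        self.k = k
        self.correct = {"stm": 1, "ltm": 1}  # smoothed correctness counters

    def predict(self, x):
        p_stm = knn_predict(self.stm, x, self.k)
        p_ltm = knn_predict(self.ltm, x, self.k)
        if p_ltm is None or self.correct["stm"] >= self.correct["ltm"]:
            return p_stm
        return p_ltm

    def partial_fit(self, x, y):
        # Prequential bookkeeping: test each memory on (x, y) before training.
        for name, mem in (("stm", self.stm), ("ltm", self.ltm)):
            if knn_predict(mem, x, self.k) == y:
                self.correct[name] += 1
        # Oldest STM example migrates to the LTM instead of being discarded.
        if len(self.stm) == self.stm.maxlen:
            self.ltm.append(self.stm[0])
        self.stm.append((x, y))
```

After a sudden drift, the STM quickly refills with examples of the new concept while the LTM retains the old one, so a recurring concept can be served from the LTM once its counter overtakes the STM's.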
