On learning guarantees to unsupervised concept drift detection on data streams

Abstract Motivated by Statistical Learning Theory (SLT), which provides a theoretical framework for establishing when supervised learning algorithms generalize from input data, this manuscript relies on the Algorithmic Stability framework to prove learning bounds for unsupervised concept drift detection on data streams. Based on this proof, we also designed the Plover algorithm, which detects drifts using different measure functions, such as Statistical Moments and the Power Spectrum. In this way, the criterion for flagging data changes can be adapted to better suit the target task. In synthetic and real-world scenarios, we observed that each data stream may require a different measure function to identify concept drifts, depending on the underlying characteristics of the corresponding application domain. In addition, we discuss the differences between our approach and others from the literature, and show illustrative results confirming the usefulness of our proposal.
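To make the role of the measure function concrete, the following is a minimal illustrative sketch and not the Plover algorithm itself, whose details are not given in this abstract. It assumes a simple scheme in which a measure function is computed on consecutive non-overlapping windows and a drift is flagged when the Euclidean distance between consecutive measures exceeds a fixed threshold. The names `moments`, `power_spectrum`, and `detect_drifts`, the window size, and the thresholds are all hypothetical choices made for illustration.

```python
import numpy as np


def moments(window):
    """Hypothetical measure: mean, variance, skewness, and kurtosis of a window."""
    w = np.asarray(window, dtype=float)
    mu = w.mean()
    sd = w.std()
    # Standardize the window; fall back to zeros for a constant window.
    z = (w - mu) / sd if sd > 0 else np.zeros_like(w)
    return np.array([mu, w.var(), (z ** 3).mean(), (z ** 4).mean()])


def power_spectrum(window):
    """Hypothetical measure: periodogram of the mean-removed window."""
    w = np.asarray(window, dtype=float)
    return np.abs(np.fft.rfft(w - w.mean())) ** 2 / len(w)


def detect_drifts(stream, window_size, measure, threshold):
    """Return start indices of windows whose measure differs from the
    previous window's measure by more than `threshold` (Euclidean norm).

    This is an assumed, simplified criterion used only for illustration.
    """
    drifts = []
    previous = None
    for start in range(0, len(stream) - window_size + 1, window_size):
        current = measure(stream[start:start + window_size])
        if previous is not None and np.linalg.norm(current - previous) > threshold:
            drifts.append(start)
        previous = current
    return drifts


# Synthetic streams (assumptions for illustration only).
t = np.arange(1000)
# Frequency changes at t = 500; amplitude and mean stay the same.
freq_stream = np.where(t < 500, np.sin(2 * np.pi * t / 50), np.sin(2 * np.pi * t / 10))
# Mean shifts at t = 500; frequency stays the same.
mean_stream = np.sin(2 * np.pi * t / 50) + 3.0 * (t >= 500)

freq_by_moments = detect_drifts(freq_stream, 100, moments, 1.0)
freq_by_spectrum = detect_drifts(freq_stream, 100, power_spectrum, 10.0)
mean_by_moments = detect_drifts(mean_stream, 100, moments, 1.0)
mean_by_spectrum = detect_drifts(mean_stream, 100, power_spectrum, 10.0)
```

In this sketch, the first four moments of the sine wave are unchanged across the frequency switch, so only the spectral measure is expected to react to it, while the mean shift is visible to the moments but not to the mean-removed spectrum. This mirrors the observation that different streams may call for different measure functions.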
