RDDM: Reactive drift detection method

Abstract: Concept drift detectors are online learning tools that mostly attempt to estimate the positions of drifts in data streams, so that the base classifier can be modified after these changes and accuracy improved. This is very important in applications such as detecting anomalies in TCP/IP traffic and/or fraud in financial transactions. The Drift Detection Method (DDM) is a simple, efficient, and well-known method whose performance is often impaired when concepts are very long. This article proposes the Reactive Drift Detection Method (RDDM), which is based on DDM and, among other modifications, discards older instances of very long concepts in order to detect drifts earlier and thereby improve final accuracy. Experiments run in MOA, using abrupt and gradual concept drift versions of different dataset generators and sizes (48 artificial datasets in total), as well as three real-world datasets, suggest that RDDM beats the accuracy results of DDM, ECDD, and STEPD in most scenarios.
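To make the baseline concrete, the following is a minimal Python sketch of the DDM decision rule as it is commonly described: track the running error rate p and its standard deviation s, remember the point where p + s was lowest, and signal a warning or a drift when p + s exceeds that minimum by 2 or 3 standard deviations. The class name `DDM`, its parameter names, and the helper `run` are illustrative assumptions, not MOA's API, and the sketch does not implement RDDM's discarding of older instances, whose details the abstract does not give.

```python
import math


class DDM:
    """Illustrative sketch of the DDM rule (names are hypothetical, not MOA's)."""

    def __init__(self, min_instances=30, warning_level=2.0, drift_level=3.0):
        self.min_instances = min_instances
        self.warning_level = warning_level
        self.drift_level = drift_level
        self.reset()

    def reset(self):
        # Forget all statistics; called at start-up and after each drift.
        self.n = 0
        self.p = 0.0            # running error rate
        self.s = 0.0            # standard deviation of the error rate
        self.p_min = math.inf   # p at the point where p + s was lowest
        self.s_min = math.inf   # s at the point where p + s was lowest

    def update(self, error):
        """error is 1 for a misclassification, 0 otherwise."""
        self.n += 1
        self.p += (error - self.p) / self.n
        self.s = math.sqrt(max(0.0, self.p * (1.0 - self.p)) / self.n)
        if self.n < self.min_instances:
            return "stable"
        if self.p + self.s <= self.p_min + self.s_min:
            self.p_min, self.s_min = self.p, self.s
        if self.p + self.s > self.p_min + self.drift_level * self.s_min:
            self.reset()
            return "drift"
        if self.p + self.s > self.p_min + self.warning_level * self.s_min:
            return "warning"
        return "stable"


def run(stream):
    """Feed a sequence of 0/1 errors and collect (position, state) events."""
    detector = DDM()
    events = []
    for i, err in enumerate(stream, start=1):
        state = detector.update(err)
        if state != "stable":
            events.append((i, state))
    return events


# Synthetic error stream: 10% errors for 1000 instances, then 50% errors.
stream = [1 if i % 10 == 0 else 0 for i in range(1, 1001)]
stream += [1 if i % 2 == 0 else 0 for i in range(1001, 2001)]
events = run(stream)
```

In this sketch the statistics accumulate over the whole current concept, so after a very long stable stretch each new error moves p only slightly. This is the weakness on very long concepts that the abstract attributes to DDM.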
