Novel hybrid pair recommendations based on a large-scale comparative study of concept drift detection

Abstract During the classification of streaming data, changes in the underlying distribution make previously learned models unreliable and imprecise, a phenomenon known as concept drift. Online learning derives information from vast volumes of stream data, which are currently generated primarily by the Internet of Things, social media applications, and the stock market, and which are usually affected by these changes in unforeseen ways. There is abundant literature on addressing concept drift using detectors, which essentially attempt to estimate the position of the change so that the base learner can be adapted and the overall accuracy improved. This paper presents novel hybrid pairs (classifier and detector) drawn from a large-scale comparison of 15 drift detectors and six classifiers. The detectors are the drift detection method (DDM), early drift detection method (EDDM), EWMA for concept drift detection (ECDD), adaptive sliding window (ADWIN), geometrical moving average (GMA), drift detection methods based on Hoeffding's bound (HDDMA and HDDMW), Fisher exact test drift detector (FTDD), fast Hoeffding drift detection method (FHDDM), Page–Hinkley test (PH), reactive drift detection method (RDDM), SEED, statistical test of equal proportions (STEPD), SeqDrift2, and Wilcoxon rank-sum test drift detector (WSTD). The classifiers are Naive Bayes (NB), Hoeffding tree (HT), Hoeffding option tree (HOT), Perceptron (P), decision stump (DS), and k-nearest neighbour (KNN). The comparison is used to determine and recommend the best pair in accordance with the properties of the dataset. The objective of this study is to assess the contribution of a detector to a classifier and obtain the most efficient matched pairs. Through these pairwise comparison experiments, the accuracy rates and evaluation times of the pairs are evaluated, as well as their false positives, true negatives, false negatives, true positives, drift detection delay, and Matthews correlation coefficient (MCC). Additionally, the Nemenyi test is employed to compare the pairs against other methods to identify the method(s) for which there is a statistically significant difference. The results of the experiments indicate that the most efficient pairs, which differed for each dataset type and size, primarily include the HDDMA, RDDM, WSTD, and FHDDM detectors.
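
As an illustration of one of the reported measures, the sketch below computes the MCC from the four confusion counts listed in the abstract, using the standard definition of the coefficient. The function name `matthews_corrcoef_from_counts` and the example counts are hypothetical and are not taken from the study; returning 0 when the denominator vanishes is a common convention assumed here, not something the abstract specifies.

```python
import math


def matthews_corrcoef_from_counts(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion counts.

    Hypothetical helper for illustration only; the counts would come from
    comparing a detector's signalled drifts with the known drift positions.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Assumed convention: report 0 when any marginal total is zero.
        return 0.0
    return numerator / denominator


if __name__ == "__main__":
    # Made-up counts, not results from the paper.
    print(matthews_corrcoef_from_counts(tp=8, tn=85, fp=5, fn=2))
```

The value ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect agreement), which makes it a convenient single summary when positives (true drifts) are far rarer than negatives.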
