DriftSurf: A Risk-competitive Learning Algorithm under Concept Drift

When learning from streaming data, a change in the data distribution, known as concept drift, can render a previously learned model inaccurate and require training a new one. We present an adaptive learning algorithm that extends previous drift-detection-based methods by embedding drift detection in a broader stable-state/reactive-state process. This lets us use aggressive drift detection in the stable state to achieve a high detection rate, while a reactive state mitigates the false positive rate of standalone drift detection: it reacts quickly to true drifts and eliminates most false positives. The algorithm is generic in its base learner and can be applied across a variety of supervised learning problems. Our theoretical analysis shows that the risk of the algorithm is competitive with that of an algorithm with oracle knowledge of when (abrupt) drifts occur. Experiments on synthetic and real datasets with concept drifts confirm our theoretical analysis.
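The stable-state/reactive-state idea can be illustrated with a minimal sketch. This is not the algorithm analyzed in the paper; it is a simplified illustration under stated assumptions. The toy base learner `MeanModel`, the detection threshold `delta`, the reactive-state length `reactive_len`, and the names `TwoStateLearner` and `run` are all hypothetical. In the stable state, a drift is flagged when the current batch loss exceeds the best loss seen so far by more than `delta`. In the reactive state, a fresh model is trained on post-trigger data only, and at the end of the state it replaces the old model only if it performs better on the latest batch.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MeanModel:
    """Toy base learner (assumption): predicts the running mean of the targets."""
    total: float = 0.0
    count: int = 0

    def predict(self) -> float:
        return self.total / self.count if self.count else 0.0

    def update(self, batch: List[float]) -> None:
        self.total += sum(batch)
        self.count += len(batch)

    def loss(self, batch: List[float]) -> float:
        p = self.predict()
        return sum((y - p) ** 2 for y in batch) / len(batch)


@dataclass
class TwoStateLearner:
    delta: float = 0.5        # hypothetical drift-detection threshold
    reactive_len: int = 2     # hypothetical reactive-state length, in batches
    model: MeanModel = field(default_factory=MeanModel)
    candidate: Optional[MeanModel] = None
    best_loss: float = float("inf")
    remaining: int = 0
    switches: int = 0

    def step(self, batch: List[float]) -> str:
        if self.model.count == 0:
            # Nothing learned yet: just train on the first batch.
            self.model.update(batch)
            return "stable"
        loss = self.model.loss(batch)  # evaluate before training on the batch
        if self.candidate is None:
            # Stable state: aggressive detection against the best loss so far.
            if loss > self.best_loss + self.delta:
                self.candidate = MeanModel()
                self.remaining = self.reactive_len
            else:
                self.best_loss = min(self.best_loss, loss)
        else:
            # Reactive state: count down, then keep whichever model is better.
            self.remaining -= 1
            if self.remaining == 0:
                cand_loss = self.candidate.loss(batch)
                if cand_loss < loss:
                    # Drift confirmed: switch to the model trained after the trigger.
                    self.model = self.candidate
                    self.best_loss = cand_loss
                    self.switches += 1
                # Otherwise the detection is treated as a false positive.
                self.candidate = None
        self.model.update(batch)
        if self.candidate is not None:
            self.candidate.update(batch)
        return "stable" if self.candidate is None else "reactive"


def run(stream: List[List[float]]) -> TwoStateLearner:
    learner = TwoStateLearner()
    for batch in stream:
        learner.step(batch)
    return learner


low, high, blip = [0.0, 0.2, -0.2], [10.0, 10.2, 9.8], [3.0, 3.2, 2.8]
drifted = run([low] * 4 + [high] * 4)          # abrupt, persistent drift
transient = run([low] * 4 + [blip] + [low] * 4)  # one outlier batch only
print(drifted.switches, transient.switches)
```

In this toy setting the persistent shift leads to a model switch, whereas the single outlier batch triggers the detector but the candidate model is discarded at the end of the reactive state, which is the false-positive filtering the abstract describes.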
