Learning in Nonstationary Environments: A Survey

The prevalence of mobile phones, Internet-of-Things technology, and sensor networks has led to an enormous and ever-increasing amount of data that are increasingly available in a streaming fashion [1]-[5]. It is often assumed, either implicitly or explicitly, that the process generating such a data stream is stationary, that is, that the data are drawn from a fixed, albeit unknown, probability distribution. In many real-world scenarios, however, this assumption does not hold, and the underlying process is intrinsically nonstationary (or evolving, or drifting). The nonstationarity can be due, for example, to seasonality or periodicity effects, changes in users' habits or preferences, hardware or software faults affecting a cyber-physical system, or thermal drifts and aging effects in sensors. In such nonstationary environments, where the probabilistic properties of the data change over time, a non-adaptive model trained under the false stationarity assumption is bound to become obsolete over time, performing sub-optimally at best or failing catastrophically at worst.
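The obsolescence of a non-adaptive model can be illustrated with a minimal sketch. It is not taken from the survey: the synthetic stream, the abrupt mean shift from 0 to 3, the forgetting factor `alpha`, and the function names `make_stream` and `evaluate` are all assumptions chosen for illustration. A static predictor is fitted once on early data, while an adaptive predictor keeps an exponentially weighted running mean, and both are scored only on the samples that arrive after the change.

```python
import random


def make_stream(n=1000, change_at=500, seed=0):
    """Synthetic stream (assumed): unit-variance Gaussian noise whose
    mean jumps abruptly from 0.0 to 3.0 at index `change_at`."""
    rng = random.Random(seed)
    return [rng.gauss(0.0 if t < change_at else 3.0, 1.0) for t in range(n)]


def evaluate(stream, train_size=200, change_at=500, alpha=0.1):
    """Return (static_mse, adaptive_mse) measured on post-change samples."""
    # Static model: the mean of the first `train_size` samples, never updated.
    static_pred = sum(stream[:train_size]) / train_size
    # Adaptive model: starts from the same estimate, then forgets old data.
    adaptive_pred = static_pred

    static_err = 0.0
    adaptive_err = 0.0
    n_eval = 0
    for t in range(train_size, len(stream)):
        x = stream[t]
        if t >= change_at:
            # Score both models before the adaptive one sees the sample.
            static_err += (x - static_pred) ** 2
            adaptive_err += (x - adaptive_pred) ** 2
            n_eval += 1
        # Exponentially weighted update: recent samples count more.
        adaptive_pred += alpha * (x - adaptive_pred)
    return static_err / n_eval, adaptive_err / n_eval


stream = make_stream()
static_mse, adaptive_mse = evaluate(stream)
print(f"post-change MSE, static model:   {static_mse:.2f}")
print(f"post-change MSE, adaptive model: {adaptive_mse:.2f}")
```

Under these assumptions the static predictor keeps predicting the old mean and accumulates a large error after the change, whereas the adaptive predictor recovers after a short transient. Exponential forgetting is only one simple, passive way to adapt; it is used here because it needs no change detector.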

[1]  Gavin Brown,et al.  Online Non-stationary Boosting , 2010, MCS.

[2]  Jesús Cid-Sueiro,et al.  Class and subclass probability re-estimation to adapt a classifier in the presence of concept drift , 2011, Neurocomputing.

[3]  John Yen,et al.  Tracking changes in user interests with a few relevance judgments , 2003, CIKM '03.

[4]  Jing Liu,et al.  Ambiguous decision trees for mining concept-drifting data streams , 2009, Pattern Recognit. Lett.

[5]  Xin Yao,et al.  DDD: A New Ensemble Approach for Dealing with Concept Drift , 2012, IEEE Transactions on Knowledge and Data Engineering.

[6]  Padraig Cunningham,et al.  A Comparison of Ensemble and Case-Base Maintenance Techniques for Handling Concept Drift in Spam Filtering , 2006, FLAIRS.

[7]  Douglas M. Hawkins,et al.  The Changepoint Model for Statistical Process Control , 2003 .

[8]  Haibo He,et al.  Towards incremental learning of nonstationary imbalanced data stream: a multiple selectively recursive approach , 2011, Evol. Syst..

[9]  Ivor W. Tsang,et al.  The Emerging "Big Dimensionality" , 2014, IEEE Computational Intelligence Magazine.

[10]  Xindong Wu,et al.  Data mining with big data , 2014, IEEE Transactions on Knowledge and Data Engineering.

[11]  Eric Eaton,et al.  Scalable Lifelong Learning with Active Task Selection , 2013, AAAI Spring Symposium: Lifelong Machine Learning.

[12]  Philip S. Yu,et al.  A Framework for Clustering Evolving Data Streams , 2003, VLDB.

[13]  Geoff Holmes,et al.  New ensemble methods for evolving data streams , 2009, KDD.

[14]  Albert Bifet,et al.  Adaptive learning and mining for data streams and frequent patterns , 2009, SKDD.

[15]  Shai Ben-David,et al.  Detecting Change in Data Streams , 2004, VLDB.

[16]  Shie Mannor,et al.  Concept Drift Detection Through Resampling , 2014, ICML.

[17]  David B. Skillicorn,et al.  Classification Using Streaming Random Forests , 2011, IEEE Transactions on Knowledge and Data Engineering.

[18]  Georg Krempl,et al.  The Algorithm APT to Classify in Concurrence of Latency and Drift , 2011, IDA.

[19]  Koichiro Yamauchi,et al.  Detecting Concept Drift Using Statistical Testing , 2007, Discovery Science.

[20]  Ricard Gavaldà,et al.  Learning from Time-Changing Data with Adaptive Windowing , 2007, SDM.

[21]  Robi Polikar,et al.  Quantifying the limited and gradual concept drift assumption , 2015, 2015 International Joint Conference on Neural Networks (IJCNN).

[22]  Manoranjan Dash,et al.  A Test Paradigm for Detecting Changes in Transactional Data Streams , 2008, DASFAA.

[23]  Xin Yao,et al.  Resampling-Based Ensemble Methods for Online Class Imbalance Learning , 2015, IEEE Transactions on Knowledge and Data Engineering.

[24]  Michèle Basseville,et al.  Detection of abrupt changes , 1993 .

[25]  Eric Eaton,et al.  ELLA: An Efficient Lifelong Learning Algorithm , 2013, ICML.

[26]  Philip S. Yu,et al.  Class-distribution regularized consensus maximization for alleviating overfitting in model combination , 2014, KDD.

[27]  Cesare Alippi,et al.  Just-in-Time Adaptive Classifiers—Part I: Detecting Nonstationary Changes , 2008, IEEE Transactions on Neural Networks.

[28]  Robi Polikar,et al.  Active learning in nonstationary environments , 2013, The 2013 International Joint Conference on Neural Networks (IJCNN).

[29]  Edith Cohen,et al.  Maintaining time-decaying stream aggregates , 2003, J. Algorithms.

[30]  Robi Polikar,et al.  COMPOSE: A Semisupervised Learning Framework for Initially Labeled Nonstationary Streaming Data , 2014, IEEE Transactions on Neural Networks and Learning Systems.

[31]  Jan Peter Patist.  Optimal Window Change Detection , 2007 .

[32]  Alexander Zien,et al.  Semi-Supervised Learning , 2006 .

[33]  Alexey Tsymbal,et al.  The problem of concept drift: definitions and related work , 2004 .

[34]  Yoav Freund,et al.  A decision-theoretic generalization of on-line learning and an application to boosting , 1995, EuroCOLT.

[35]  Indre Zliobaite,et al.  Change with Delayed Labeling: When is it Detectable? , 2010, 2010 IEEE International Conference on Data Mining Workshops.

[36]  Abraham Kandel,et al.  Info-fuzzy algorithms for mining dynamic data streams , 2008, Appl. Soft Comput..

[37]  A. Bifet,et al.  Early Drift Detection Method , 2005 .

[38]  Georg Krempl,et al.  Drift mining in data: A framework for addressing drift in classification , 2013, Comput. Stat. Data Anal..

[39]  Marcus A. Maloof,et al.  Dynamic Weighted Majority: An Ensemble Method for Drifting Concepts , 2007, J. Mach. Learn. Res..

[40]  Cesare Alippi,et al.  An adaptive CUSUM-based test for signal change detection , 2006, 2006 IEEE International Symposium on Circuits and Systems.

[41]  Christian Sohler,et al.  StreamKM++: A clustering algorithm for data streams , 2010, JEAL.

[42]  Geoff Hulten,et al.  Mining time-changing data streams , 2001, KDD '01.

[43]  João Gama,et al.  Learning with Drift Detection , 2004, SBIA.

[44]  Xin Yao,et al.  The Impact of Diversity on Online Ensemble Learning in the Presence of Concept Drift , 2010, IEEE Transactions on Knowledge and Data Engineering.

[45]  Ivan Koychev,et al.  Gradual Forgetting for Adaptation to Concept Drift , 2000 .

[46]  Gregory Ditzler,et al.  Hellinger distance based drift detection for nonstationary environments , 2011, 2011 IEEE Symposium on Computational Intelligence in Dynamic and Uncertain Environments (CIDUE).

[47]  Cesare Alippi,et al.  An effective just-in-time adaptive classifier for gradual concept drifts , 2011, The 2011 International Joint Conference on Neural Networks.

[48]  Geoff Hulten,et al.  A General Framework for Mining Massive Data Streams , 2003 .

[49]  Cesare Alippi,et al.  A hierarchical, nonparametric, sequential change-detection test , 2011, The 2011 International Joint Conference on Neural Networks.

[50]  Cesare Alippi,et al.  Intelligence for Embedded Systems , 2014 .

[51]  Cesare Alippi,et al.  Just in time classifiers: Managing the slow drift case , 2009, 2009 International Joint Conference on Neural Networks.

[52]  Yong Shi,et al.  A Regularized Multiple Criteria Linear Program for Classification , 2007 .

[53]  Gail A. Carpenter,et al.  ARTMAP: a self-organizing neural network architecture for fast supervised learning and pattern recognition , 1991, IJCNN-91-Seattle International Joint Conference on Neural Networks.

[54]  Ralf Klinkenberg,et al.  Learning drifting concepts: Example selection vs. example weighting , 2004, Intell. Data Anal..

[55]  Mykola Pechenizkiy,et al.  Dynamic integration of classifiers for handling concept drift , 2008, Inf. Fusion.

[56]  Stephen Grossberg,et al.  Nonlinear neural networks: Principles, mechanisms, and architectures , 1988, Neural Networks.

[57]  Nitesh V. Chawla,et al.  SMOTE: Synthetic Minority Over-sampling Technique , 2002, J. Artif. Intell. Res..

[58]  Qiang Yang,et al.  A Survey on Transfer Learning , 2010, IEEE Transactions on Knowledge and Data Engineering.

[59]  Xin Yao,et al.  Using Class Imbalance Learning for Software Defect Prediction , 2013, IEEE Transactions on Reliability.

[60]  Koby Crammer,et al.  A theory of learning from different domains , 2010, Machine Learning.

[61]  Li Guo,et al.  Comparative study between incremental and ensemble learning on data streams: Case study , 2014, Journal Of Big Data.

[62]  Ralf Klinkenberg,et al.  An Ensemble Classifier for Drifting Concepts , 2005 .

[63]  Ricard Gavaldà,et al.  Kalman Filters and Adaptive Windows for Learning in Data Streams , 2006, Discovery Science.

[64]  Cesare Alippi,et al.  Just-in-Time Adaptive Classifiers—Part II: Designing the Classifier , 2008, IEEE Transactions on Neural Networks.

[65]  Geoff Holmes,et al.  Improving Adaptive Bagging Methods for Evolving Data Streams , 2009, ACML.

[66]  Pavlos Protopapas,et al.  Computational Intelligence Challenges and Applications on Large-Scale Astronomical Time Series Databases , 2014, IEEE Computational Intelligence Magazine.

[67]  Bernhard Schölkopf,et al.  Introduction to Semi-Supervised Learning , 2006, Semi-Supervised Learning.

[68]  Kun Zhang,et al.  Classifying Imbalanced Data Streams via Dynamic Feature Group Weighting with Importance Sampling , 2014, SDM.

[69]  Gianmarco De Francisci Morales,et al.  SAMOA: scalable advanced massive online analysis , 2015, J. Mach. Learn. Res..

[70]  Haibo He,et al.  SERA: Selectively recursive approach towards nonstationary imbalanced stream data mining , 2009, 2009 International Joint Conference on Neural Networks.

[71]  Gunnar Rätsch,et al.  An Empirical Analysis of Domain Adaptation Algorithms for Genomic Sequence Analysis , 2008, NIPS.

[72]  Yun Sing Koh,et al.  One Pass Concept Change Detection for Data Streams , 2013, PAKDD.

[73]  Indre Zliobaite,et al.  Change with Delayed Labeling: When is it Detectable? , 2010, ICDM 2010.

[74]  Geoff Holmes,et al.  Active Learning With Drifting Streaming Data , 2014, IEEE Transactions on Neural Networks and Learning Systems.

[75]  John Yen,et al.  An adaptive algorithm for learning changes in user interests , 1999, CIKM '99.

[76]  Manfred K. Warmuth,et al.  The weighted majority algorithm , 1989, 30th Annual Symposium on Foundations of Computer Science.

[77]  Gregory Ditzler,et al.  Domain adaptation bounds for multiple expert systems under concept drift , 2014, 2014 International Joint Conference on Neural Networks (IJCNN).

[78]  Cesare Alippi,et al.  A just-in-time adaptive classification system based on the intersection of confidence intervals rule , 2011, Neural Networks.

[79]  William Nick Street,et al.  A streaming ensemble algorithm (SEA) for large-scale classification , 2001, KDD '01.

[80]  Georg Krempl,et al.  Classification in Presence of Drift and Latency , 2011, 2011 IEEE 11th International Conference on Data Mining Workshops.

[81]  Geoff Holmes,et al.  MOA: Massive Online Analysis , 2010, J. Mach. Learn. Res..

[82]  Motoaki Kawanabe,et al.  Machine Learning in Non-Stationary Environments - Introduction to Covariate Shift Adaptation , 2012, Adaptive computation and machine learning.

[83]  Li Guo,et al.  Classifier and Cluster Ensembles for Mining Concept Drifting Data Streams , 2010, 2010 IEEE International Conference on Data Mining.

[84]  Ludmila I. Kuncheva,et al.  Classifier Ensembles for Changing Environments , 2004, Multiple Classifier Systems.

[85]  Koichiro Yamauchi.  Incremental learning and model selection under virtual concept drifting environments , 2010, The 2010 International Joint Conference on Neural Networks (IJCNN).

[86]  Abraham Kandel,et al.  Real-time data mining of non-stationary data streams from sensor networks , 2008, Inf. Fusion.

[87]  Robi Polikar,et al.  Semi-supervised learning in initially labeled non-stationary environments with gradual drift , 2012, The 2012 International Joint Conference on Neural Networks (IJCNN).

[88]  Gregory Ditzler,et al.  Semi-supervised learning in nonstationary environments , 2011, The 2011 International Joint Conference on Neural Networks.

[89]  Aoying Zhou,et al.  Density-Based Clustering over an Evolving Data Stream with Noise , 2006, SDM.

[90]  Haibo He,et al.  Learning from Imbalanced Data , 2009, IEEE Transactions on Knowledge and Data Engineering.

[91]  Ludmila I. Kuncheva,et al.  A framework for generating data to simulate changing environments , 2007, Artificial Intelligence and Applications.

[92]  Vasant Honavar,et al.  Learn++: an incremental learning algorithm for supervised neural networks , 2001, IEEE Trans. Syst. Man Cybern. Part C.

[93]  Albert Bifet,et al.  Sentiment Knowledge Discovery in Twitter Streaming Data , 2010, Discovery Science.

[94]  Cesare Alippi,et al.  Just-in-time ensemble of classifiers , 2012, The 2012 International Joint Conference on Neural Networks (IJCNN).

[95]  Motoaki Kawanabe,et al.  Direct Importance Estimation with Model Selection and Its Application to Covariate Shift Adaptation , 2007, NIPS.

[96]  Yishay Mansour,et al.  Domain Adaptation with Multiple Sources , 2008, NIPS.

[97]  José del Campo-Ávila,et al.  Online and Non-Parametric Drift Detection Methods Based on Hoeffding’s Bounds , 2015, IEEE Transactions on Knowledge and Data Engineering.

[98]  Koichiro Yamauchi,et al.  Learning, detecting, understanding, and predicting concept changes , 2009, 2009 International Joint Conference on Neural Networks.

[99]  Stephen Grossberg,et al.  Art 2: Self-Organization Of Stable Category Recognition Codes For Analog Input Patterns , 1988, Other Conferences.

[100]  Gregory Ditzler,et al.  Transductive learning algorithms for nonstationary environments , 2012, The 2012 International Joint Conference on Neural Networks (IJCNN).

[101]  Gregory Ditzler,et al.  Discounted expert weighting for concept drift , 2013, 2013 IEEE Symposium on Computational Intelligence in Dynamic and Uncertain Environments (CIDUE).

[102]  Christian Sohler,et al.  StreamKM++: A Clustering Algorithm for Data Streams , 2010, Workshop on Algorithm Engineering and Experimentation.

[103]  Stephen Grossberg,et al.  Fuzzy ARTMAP: A neural network architecture for incremental supervised learning of analog multidimensional maps , 1992, IEEE Trans. Neural Networks.

[104]  Gerhard Widmer,et al.  Learning in the presence of concept drift and hidden contexts , 2004, Machine Learning.

[105]  Robi Polikar,et al.  Incremental learning in nonstationary environments with controlled forgetting , 2009, 2009 International Joint Conference on Neural Networks.

[106]  Graham J. Williams,et al.  Big Data Opportunities and Challenges: Discussions from Data Analytics Perspectives [Discussion Forum] , 2014, IEEE Computational Intelligence Magazine.

[107]  Neil D. Lawrence,et al.  Dataset Shift in Machine Learning , 2009 .

[108]  Grigorios Tsoumakas,et al.  Dealing with Concept Drift and Class Imbalance in Multi-Label Stream Classification , 2011, IJCAI.

[109]  Gregory Ditzler,et al.  Incremental Learning of Concept Drift from Streaming Imbalanced Data , 2013, IEEE Transactions on Knowledge and Data Engineering.

[110]  P. Armitage.  Sequential Medical Trials , 1961, Biomedicine / [publiee pour l'A.A.I.C.I.G.].

[111]  Manuel Roveri,et al.  A cognitive monitoring system for contaminant detection in intelligent buildings , 2014, 2014 International Joint Conference on Neural Networks (IJCNN).

[112]  Charu C. Aggarwal,et al.  On biased reservoir sampling in the presence of stream evolution , 2006, VLDB.

[113]  Yun Sing Koh,et al.  Detecting concept change in dynamic data streams , 2013, Machine Learning.

[114]  Nitesh V. Chawla,et al.  Editorial: special issue on learning from imbalanced data sets , 2004, SKDD.

[115]  Gregory Ditzler,et al.  An ensemble based incremental learning framework for concept drift and class imbalance , 2010, The 2010 International Joint Conference on Neural Networks (IJCNN).

[116]  Jeffrey Scott Vitter,et al.  Random sampling with a reservoir , 1985, TOMS.

[117]  Cesare Alippi,et al.  Just-In-Time Classifiers for Recurrent Concepts , 2013, IEEE Transactions on Neural Networks and Learning Systems.

[118]  Jerzy Stefanowski,et al.  Reacting to Different Types of Concept Drift: The Accuracy Updated Ensemble Algorithm , 2014, IEEE Transactions on Neural Networks and Learning Systems.

[119]  Philip S. Yu,et al.  Classifying Data Streams with Skewed Class Distributions and Concept Drifts , 2008, IEEE Internet Computing.

[120]  Dimitris K. Tasoulis,et al.  Nonparametric Monitoring of Data Streams for Changes in Location and Scale , 2011, Technometrics.

[121]  R. Knight,et al.  Moving pictures of the human microbiome , 2011, Genome Biology.

[122]  Geoff Hulten,et al.  Mining high-speed data streams , 2000, KDD '00.

[123]  Philip S. Yu,et al.  A General Framework for Mining Concept-Drifting Data Streams with Skewed Distributions , 2007, SDM.

[124]  Haibo He,et al.  MuSeRA: Multiple Selectively Recursive Approach towards imbalanced stream data mining , 2010, The 2010 International Joint Conference on Neural Networks (IJCNN).

[125]  Francesco Piazza,et al.  Online sequential extreme learning machine in nonstationary environments , 2013, Neurocomputing.

[126]  Cesare Alippi,et al.  Change detection tests using the ICI rule , 2010, The 2010 International Joint Conference on Neural Networks (IJCNN).

[127]  Robi Polikar,et al.  Incremental Learning of Concept Drift in Nonstationary Environments , 2011, IEEE Transactions on Neural Networks.

[128]  Radford M. Neal.  Pattern Recognition and Machine Learning , 2007, Technometrics.

[129]  R. Khan,et al.  Sequential Tests of Statistical Hypotheses. , 1972 .

[130]  Klaus-Robert Müller,et al.  Covariate Shift Adaptation by Importance Weighted Cross Validation , 2007, J. Mach. Learn. Res..

[131]  João Gama,et al.  A survey on concept drift adaptation , 2014, ACM Comput. Surv..

[132]  Yizhou Sun,et al.  Graph-based Consensus Maximization among Multiple Supervised and Unsupervised Models , 2009, NIPS.