Fuzzy competence model drift detection for data-driven decision support systems

Abstract This paper focuses on concept drift in business intelligence and data-driven decision support systems (DSSs). Conventional static DSSs assume that the data follow a fixed distribution, so they become inaccurate and make incorrect decisions when concept drift occurs. A DSS therefore needs to know when, how, and where concept drift occurs so that it can adjust its decision-processing knowledge at the appropriate time and adapt to an ever-changing environment. This paper presents a data distribution-based concept drift detection method called fuzzy competence model drift detection (FCM-DD). By introducing fuzzy set theory and replacing crisp boundaries with fuzzy ones, we improve the competence model so that it provides a more refined empirical distribution of the data stream. FCM-DD requires no prior knowledge of the underlying distribution and, based on the theory of bootstrapping, provides a statistical guarantee of the reliability of the detected drift. A series of experiments shows that the proposed FCM-DD method detects drift more accurately, has good sensitivity, and is robust.
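To make the idea of distribution-based detection with a resampling guarantee concrete, the following is a minimal Python sketch. It is not FCM-DD itself: it swaps the fuzzy competence model for a plain equal-width histogram over one-dimensional data, and it estimates significance by resampling with replacement from the pooled windows. The function names (`histogram`, `tv_distance`, `detect_drift`) and all parameters (`n_resamples`, `alpha`, `n_bins`, `seed`) are hypothetical and chosen only for illustration; both windows are assumed to be non-empty.

```python
import random


def histogram(values, lo, width, n_bins):
    """Return the relative frequency of values in n_bins equal-width bins."""
    counts = [0] * n_bins
    for v in values:
        # Clamp so that the maximum value falls into the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return [c / len(values) for c in counts]


def tv_distance(a, b, lo, width, n_bins):
    """Total variation distance between the binned distributions of a and b."""
    pa = histogram(a, lo, width, n_bins)
    pb = histogram(b, lo, width, n_bins)
    return 0.5 * sum(abs(x - y) for x, y in zip(pa, pb))


def detect_drift(reference, current, n_resamples=500, alpha=0.05,
                 n_bins=10, seed=0):
    """Flag drift if the observed distance is unlikely under 'no change'.

    Assumption: a simple histogram stands in for the competence model,
    and the null distribution of the distance is approximated by
    resampling with replacement from the pooled data.
    """
    rng = random.Random(seed)
    pooled = list(reference) + list(current)
    lo, hi = min(pooled), max(pooled)
    width = (hi - lo) / n_bins or 1.0  # guard against constant data
    observed = tv_distance(reference, current, lo, width, n_bins)

    n_ref = len(reference)
    exceed = 0
    for _ in range(n_resamples):
        sample = rng.choices(pooled, k=len(pooled))
        d = tv_distance(sample[:n_ref], sample[n_ref:], lo, width, n_bins)
        if d >= observed:
            exceed += 1

    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (exceed + 1) / (n_resamples + 1)
    return p_value < alpha, p_value, observed


# Usage: a reference window and a window whose mean has shifted.
data_rng = random.Random(1)
reference = [data_rng.gauss(0.0, 1.0) for _ in range(300)]
shifted = [data_rng.gauss(2.0, 1.0) for _ in range(300)]
drift, p_value, distance = detect_drift(reference, shifted)
print(drift, round(p_value, 4), round(distance, 3))
```

The resampling step is what removes the need for prior knowledge of the underlying distribution: the threshold for declaring drift comes from the data themselves rather than from a parametric assumption.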
