Ensembles of Heterogeneous Concept Drift Detectors - Experimental Study

For contemporary enterprises, the ability to make sound business decisions based on knowledge hidden in stored data is a critical success factor. Decision support software should therefore take into account that data usually arrives continuously in the form of a so-called data stream, yet most traditional data analysis methods cannot efficiently analyze the fast-growing volume of stored records. One should also consider a phenomenon that appears in data streams called concept drift, which means that the parameters of the model in use change over time, and this can dramatically decrease the quality of the analytical model. This work focuses on the classification task, which is common in many practical applications such as fraud detection, network security, and medical diagnosis. We propose a way to detect changes in the data stream using a combined concept drift detection model. The experimental evaluation confirms its good quality, which encourages us to use it in practical applications.
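
To make the idea of a combined drift detection model concrete, the following is a minimal Python sketch and not the method evaluated in this study. It assumes two heterogeneous member detectors (a simplified DDM-style test and a Page-Hinkley test, both fed the classifier's 0/1 error stream) and a simple voting rule. The class names, the thresholds, the `min_votes` rule, and the synthetic error stream are all illustrative assumptions.

```python
import math


class DDMDetector:
    """Simplified DDM-style test on the running error rate (assumed member)."""

    def __init__(self, min_samples=30, drift_level=3.0):
        self.min_samples = min_samples
        self.drift_level = drift_level
        self.reset()

    def reset(self):
        self.n = 0
        self.p = 0.0                  # running error rate
        self.best = float("inf")      # smallest p + s observed so far
        self.p_min = self.s_min = 0.0
        self.drift = False

    def update(self, error):
        self.n += 1
        self.p += (error - self.p) / self.n
        s = math.sqrt(self.p * (1.0 - self.p) / self.n)
        if self.n >= self.min_samples:
            if self.p + s < self.best:
                self.best, self.p_min, self.s_min = self.p + s, self.p, s
            if self.p + s > self.p_min + self.drift_level * self.s_min:
                self.drift = True     # stays latched until reset()
        return self.drift


class PageHinkleyDetector:
    """Page-Hinkley test for an increase in the mean error (assumed member)."""

    def __init__(self, delta=0.005, threshold=10.0):
        self.delta = delta
        self.threshold = threshold
        self.reset()

    def reset(self):
        self.n = 0
        self.mean = 0.0
        self.cum = 0.0                # cumulative deviation from the mean
        self.cum_min = 0.0
        self.drift = False

    def update(self, error):
        self.n += 1
        self.mean += (error - self.mean) / self.n
        self.cum += error - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        if self.cum - self.cum_min > self.threshold:
            self.drift = True         # stays latched until reset()
        return self.drift


class VotingDriftEnsemble:
    """Signals drift once at least min_votes members have latched (assumed rule)."""

    def __init__(self, detectors, min_votes):
        self.detectors = detectors
        self.min_votes = min_votes

    def update(self, error):
        votes = sum(d.update(error) for d in self.detectors)
        if votes >= self.min_votes:
            for d in self.detectors:
                d.reset()             # start fresh on the new concept
            return True
        return False


def synthetic_errors(length=1000, drift_at=500):
    """Deterministic 0/1 error stream: 10% errors, then 50% after drift_at."""
    return [int(i % 10 == 9) if i < drift_at else int(i % 2 == 1)
            for i in range(length)]


ensemble = VotingDriftEnsemble([DDMDetector(), PageHinkleyDetector()], min_votes=2)
detections = [i for i, e in enumerate(synthetic_errors()) if ensemble.update(e)]
print(detections)
```

On this synthetic stream the ensemble should report a detection shortly after index 500, where the error rate jumps. Requiring both members to agree trades some detection delay for fewer false alarms; lowering `min_votes` to 1 would make the ensemble react as soon as its fastest member does.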
