An Adaptive Heterogeneous Online Learning Ensemble Classifier for Nonstationary Environments

In recent years, technological advances have produced an enormous and ever-increasing amount of data, much of which now arrives as a stream. In such nonstationary environments, the underlying process that generates the data stream evolves over time, a phenomenon known as concept drift. As more applications rely on data generation mechanisms that are susceptible to change, the need for effective and efficient algorithms that learn from and adapt to drifting environments can hardly be overstated. In dynamic environments subject to concept drift, learning models are frequently updated to adapt to changes in the underlying probability distribution of the data. Much of the work on learning in nonstationary environments focuses on updating the predictive model to optimize recovery from concept drift and convergence to new concepts by adjusting parameters and discarding poorly performing models; little effort has been dedicated to investigating which type of learning model is suitable at a given time for a given type of concept drift. In this paper, we investigate the impact of heterogeneous online ensemble learning based on online model selection for predictive modeling in dynamic environments. We propose a novel heterogeneous ensemble approach based on online dynamic ensemble selection that accurately interchanges between different types of base models in an ensemble to enhance its predictive performance in nonstationary environments. The approach, called Heterogeneous Dynamic Ensemble Selection based on Accuracy and Diversity (HDES-AD), uses models generated by different base learners to increase diversity. This circumvents a problem of existing dynamic ensemble classifiers, which may lose diversity because they exclude base learners generated by different base algorithms.
The algorithm is evaluated on artificial and real-world datasets against the well-known homogeneous online ensemble approaches DDD, AFWE, and OAUE. The results show that HDES-AD performed significantly better than these three homogeneous approaches in nonstationary environments.
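
The general idea of selecting among heterogeneous online base models by recent accuracy and diversity can be illustrated with a minimal sketch. It is not the HDES-AD algorithm itself, whose details are not given here. The base learners (`MajorityClass`, `OnlinePerceptron`), the sliding-window accuracy, the disagreement measure, and the greedy selection rule are all assumptions chosen for illustration.

```python
from collections import Counter, deque


class MajorityClass:
    """Hypothetical base learner: predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, x):
        return self.counts.most_common(1)[0][0] if self.counts else 0

    def learn(self, x, y):
        self.counts[y] += 1


class OnlinePerceptron:
    """Hypothetical base learner: binary perceptron updated one example at a time."""

    def __init__(self, n_features):
        self.w = [0.0] * n_features
        self.b = 0.0

    def predict(self, x):
        score = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        return 1 if score > 0 else 0

    def learn(self, x, y):
        err = y - self.predict(x)  # -1, 0, or +1
        if err:
            self.w = [wi + err * xi for wi, xi in zip(self.w, x)]
            self.b += err


def windowed_accuracy(hits):
    """Share of correct predictions in the recent window (0.0 if empty)."""
    return sum(hits) / len(hits) if hits else 0.0


def disagreement(hits_a, hits_b):
    """Share of windowed examples on which exactly one of two models was correct."""
    pairs = list(zip(hits_a, hits_b))
    return sum(a != b for a, b in pairs) / len(pairs) if pairs else 0.0


def select(history, k=2):
    """Assumed rule: take the most accurate model as leader, then add the
    k-1 models with the highest accuracy plus disagreement with the leader."""
    names = sorted(history, key=lambda n: windowed_accuracy(history[n]), reverse=True)
    leader = names[0]
    rest = sorted(
        names[1:],
        key=lambda n: windowed_accuracy(history[n]) + disagreement(history[leader], history[n]),
        reverse=True,
    )
    return [leader] + rest[: k - 1]


def run(stream, models, window=20, k=2):
    """Test-then-train loop: predict with the selected subset, then update all models."""
    history = {name: deque(maxlen=window) for name in models}
    correct = total = 0
    for x, y in stream:
        chosen = select(history, k)
        votes = Counter(models[n].predict(x) for n in chosen)
        y_hat = votes.most_common(1)[0][0]  # ties go to the leader's vote
        correct += int(y_hat == y)
        total += 1
        for name, model in models.items():
            history[name].append(int(model.predict(x) == y))
            model.learn(x, y)
    return correct / total if total else 0.0


# Toy stream with an abrupt drift: the labeling rule flips halfway through.
values = [(-1) ** i * (1 + i % 3) for i in range(200)]
stream = [((v,), int(v > 0) if i < 100 else int(v < 0)) for i, v in enumerate(values)]
models = {"majority": MajorityClass(), "perceptron": OnlinePerceptron(1)}
accuracy = run(stream, models)
```

Every model is scored on each example before it is trained on it, so the windowed statistics reflect performance on unseen data. The sliding window lets the selection shift toward whichever model type currently suits the concept.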
