On Taxonomy and Evaluation of Feature Selection-Based Learning Classifier System Ensemble Approaches for Data Mining Problems

Ensemble methods combine multiple learning machines to improve performance on a learning task, measured in terms of prediction accuracy, scalability, and other criteria. These methods have been applied to evolutionary machine learning techniques, including learning classifier systems (LCSs). In this article, we first propose a conceptual framework that categorizes ensemble-based methods, enabling fair comparison and highlighting gaps in the corresponding literature. The framework is generic and consists of three sequential stages: a pre-gate stage concerned with data preparation; a member stage concerned with the types of learning machines used to build the ensemble; and a post-gate stage concerned with how the ensemble outputs are combined. A taxonomy of LCS-based ensembles is then presented using this framework. The article then focuses on comparing LCS ensembles that apply feature selection in the pre-gate stage. An evaluation methodology is proposed to systematically analyze the performance of these methods. Specifically, LCS ensembles built with random feature sampling are compared against those built with rough set feature selection. Experimental results show that the rough set-based approach achieves significantly higher classification accuracy than the random subspace method on problems with large numbers of irrelevant features. On problems with large numbers of redundant features, the performance of the two approaches is comparable.
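To make the three-stage framework concrete, the sketch below wires the stages together for the random-subspace variant. It is a minimal illustration under stated assumptions, not the authors' implementation: scikit-learn decision trees stand in for the LCS members (e.g., XCS) to keep the sketch self-contained, the class name SubspaceEnsemble and its parameters are hypothetical, and integer class labels are assumed for the majority vote.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

class SubspaceEnsemble:
    """Three-stage ensemble: pre-gate feature sampling, member training,
    post-gate vote combination."""

    def __init__(self, n_members=10, subspace_size=5, seed=0):
        self.n_members = n_members
        self.subspace_size = subspace_size
        self.rng = np.random.default_rng(seed)
        self.members = []  # list of (feature_indices, fitted_model) pairs

    def fit(self, X, y):
        n_features = X.shape[1]
        k = min(self.subspace_size, n_features)
        for _ in range(self.n_members):
            # Pre-gate stage: each member sees its own random feature subspace.
            idx = self.rng.choice(n_features, size=k, replace=False)
            # Member stage: one base learner per subspace (a decision tree
            # here; an LCS such as XCS in the paper's setting).
            model = DecisionTreeClassifier(random_state=0).fit(X[:, idx], y)
            self.members.append((idx, model))
        return self

    def predict(self, X):
        # Post-gate stage: majority vote over member predictions
        # (np.bincount assumes non-negative integer class labels).
        votes = np.stack([model.predict(X[:, idx]) for idx, model in self.members])
        return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
```

Swapping the pre-gate stage is the only change needed to move between the two approaches compared in the article; the member and post-gate stages are untouched.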
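The rough set variant replaces the random subspace draw in the pre-gate stage with attribute reducts. As a rough illustration of the underlying idea only, the greedy sketch below grows a reduct by the size of the positive region (the objects whose equivalence class under the chosen attributes is class-pure); it assumes discretized, nominal attributes, and the helper names greedy_reduct and positive_region_size are hypothetical, not taken from the cited work.

```python
import numpy as np

def positive_region_size(X, y, attrs):
    # Objects whose equivalence class under `attrs` is class-pure belong to
    # the positive region; its size measures how well `attrs` determine the
    # decision class.
    labels_by_block = {}
    for row, label in zip(X[:, attrs], y):
        labels_by_block.setdefault(tuple(row), set()).add(label)
    return sum(len(labels_by_block[tuple(row)]) == 1 for row in X[:, attrs])

def greedy_reduct(X, y):
    # Greedily add the attribute that most enlarges the positive region
    # until the subset determines the class as well as all attributes do.
    all_attrs = list(range(X.shape[1]))
    target = positive_region_size(X, y, all_attrs)
    reduct = []
    while positive_region_size(X, y, reduct) < target:
        remaining = [a for a in all_attrs if a not in reduct]
        best = max(remaining, key=lambda a: positive_region_size(X, y, reduct + [a]))
        reduct.append(best)
    return reduct
```

Each ensemble member would then be trained on a different reduct rather than a random subspace, which is what lets the rough set approach discard irrelevant features before learning starts.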
