Feature Selection Inspired Classifier Ensemble Reduction

Classifier ensembles constitute one of the main research directions in machine learning and data mining. Using multiple classifiers generally yields better predictive performance than a single model can achieve. Several approaches exist in the literature for constructing and aggregating such ensembles. However, these ensemble systems often contain redundant members that, if removed, may further increase group diversity and produce better results. Smaller ensembles also relax memory and storage requirements, reducing the system's run-time overhead and improving overall efficiency. This paper extends ideas developed for feature selection to support classifier ensemble reduction, by transforming ensemble predictions into training samples and treating classifiers as features. The global heuristic harmony search is then used to select a reduced subset of such artificial features while attempting to maximize the feature subset evaluation score. The resulting technique is systematically evaluated on high-dimensional and large benchmark datasets, showing superior classification performance compared with both the original, unreduced ensembles and randomly formed subsets.
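The idea of treating classifiers as features can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names, the majority-vote accuracy used as the subset evaluator, the size penalty, and the simplified binary harmony search (memory consideration and random selection only, no pitch adjustment) are all assumptions made for illustration. Each row of `predictions` holds one sample's predicted labels, one column per ensemble member, so each column plays the role of an artificial feature.

```python
import random
from collections import Counter


def majority_vote_accuracy(predictions, labels, subset):
    """Accuracy of a plurality vote over the classifier columns in `subset`."""
    if not subset:
        return 0.0
    correct = 0
    for row, truth in zip(predictions, labels):
        votes = Counter(row[j] for j in subset)
        if votes.most_common(1)[0][0] == truth:
            correct += 1
    return correct / len(labels)


def score(predictions, labels, mask, size_penalty=0.01):
    """Hypothetical subset evaluation: vote accuracy minus a small size penalty."""
    subset = [j for j, keep in enumerate(mask) if keep]
    accuracy = majority_vote_accuracy(predictions, labels, subset)
    return accuracy - size_penalty * len(subset)


def harmony_search(predictions, labels, memory_size=10, hmcr=0.9,
                   iterations=300, seed=0):
    """Simplified binary harmony search over classifier subsets.

    Each harmony is a boolean mask with one entry per classifier.
    `hmcr` is the probability of copying a bit from harmony memory
    instead of drawing it at random.
    """
    rng = random.Random(seed)
    n = len(predictions[0])
    # Initialise the harmony memory with random masks.
    memory = [[rng.random() < 0.5 for _ in range(n)] for _ in range(memory_size)]
    scores = [score(predictions, labels, m) for m in memory]
    for _ in range(iterations):
        # Improvise a new harmony, bit by bit.
        new = [
            rng.choice(memory)[j] if rng.random() < hmcr else rng.random() < 0.5
            for j in range(n)
        ]
        new_score = score(predictions, labels, new)
        # Replace the worst stored harmony if the new one is better.
        worst = min(range(memory_size), key=scores.__getitem__)
        if new_score > scores[worst]:
            memory[worst], scores[worst] = new, new_score
    best = max(range(memory_size), key=scores.__getitem__)
    selected = [j for j, keep in enumerate(memory[best]) if keep]
    return selected, scores[best]


if __name__ == "__main__":
    # Toy data: classifier 0 is always right, classifiers 1-5 always predict 0.
    labels = [i % 2 for i in range(20)]
    predictions = [[y, 0, 0, 0, 0, 0] for y in labels]
    selected, best_score = harmony_search(predictions, labels)
    print("selected classifiers:", selected, "score:", best_score)
```

In this sketch the evaluator is a wrapper-style score on the ensemble's own votes; any other feature subset evaluator could be substituted for `score` without changing the search loop.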
