Ensembles for Feature Selection

This chapter describes the ensemble approach applied to feature selection, a classical preprocessing step that has become critically important in the present context of Big Data and high-dimensional datasets. Section 4.1 introduces the context of ensembles for feature selection, which Sects. 4.2 and 4.3 then detail for homogeneous and heterogeneous ensembles, respectively. Each of these two sections illustrates its concepts with a use case based on rankers (Sects. 4.2.1 and 4.3.1). Finally, Sect. 4.4 briefly compares the results obtained by the two approaches in the use cases, with the aim of giving readers a short guideline on when each is best used.
