Ensembles for feature selection: A review and future trends

Abstract Ensemble learning is a prolific field in machine learning. It rests on the assumption that combining the output of multiple models is better than using a single model, and in practice it usually provides good results. It has most commonly been employed for classification, but it can also be used to improve other tasks such as feature selection. Feature selection consists of selecting the relevant features for a problem and discarding those that are irrelevant or redundant, with the main goal of improving classification accuracy. In this work, we provide the reader with the basic concepts necessary to build an ensemble for feature selection, review the most recent advances, and comment on the future trends and challenges that remain to be addressed.
