Improving the stability of wrapper variable selection applied to binary classification

Wrapper variable selection methods are widely adopted in many applications, including the design of classifiers. The main problem with these approaches is the stability of the selection: training on different data sets can lead to the selection of different variable subsets. This problem is particularly critical in applications where variable selection is used to interpret the behaviour of the process or phenomenon under consideration, i.e. to understand which variables, among a potentially huge list, actually affect the classification. The paper proposes a method that improves the stability of wrapper variable selection procedures while preserving, and possibly improving, the classification performance. The method is evaluated with three binary classifiers in order to demonstrate its effectiveness.
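
To make the notion of selection stability concrete, the sketch below scores how much the variable subsets chosen on different training sets agree with one another. It is only an illustration: the abstract does not state which stability measure the paper uses, so the mean pairwise Jaccard similarity is an assumption here, and the function names (`jaccard`, `selection_stability`) and the variable names in `runs` are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two variable subsets (defined as 1.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def selection_stability(subsets):
    """Mean pairwise Jaccard similarity over subsets selected on different training sets.

    Assumption: this is one common stability index, not necessarily the paper's.
    """
    pairs = list(combinations(subsets, 2))
    if not pairs:
        raise ValueError("need at least two selected subsets")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical outcome of running the same wrapper on three training sets.
runs = [
    {"x1", "x2", "x3"},
    {"x1", "x2", "x4"},
    {"x1", "x2", "x3"},
]
stability = selection_stability(runs)
print(f"stability = {stability:.3f}")
```

A value of 1 means every training set produced the same subset; values near 0 mean the selections barely overlap, which is the situation that makes the selected variables hard to interpret.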
