Feature Selection Techniques for Classification: A Widely Applicable Code Library

In an era where accumulating data is easy and storing it is inexpensive, feature selection plays a central role in reducing the dimensionality of huge amounts of otherwise meaningless data. This manuscript gives an overview of the concepts of feature selection and surveys existing feature selection algorithms for classification, grouped by complexity into filter, embedded, and wrapper methods. Some real-world applications are included. We conclude by identifying trends and challenges in feature selection research and development, and we provide a code library of methods selected from recent literature.
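As a concrete illustration of the simplest of the three families named above, the following is a minimal sketch of a filter method: each feature is scored independently of any classifier, and the top-ranked features are kept. The Fisher-style score (between-class scatter divided by within-class scatter) and the names `fisher_scores` and `select_top_k` are assumptions chosen for illustration; they are not taken from the code library the manuscript describes.

```python
from collections import defaultdict
from statistics import fmean, pvariance


def fisher_scores(X, y):
    """Return one Fisher-style score per feature column.

    X is a list of rows (each a list of numbers), y the class labels.
    Hypothetical helper for illustration only.
    """
    rows_by_class = defaultdict(list)
    for row, label in zip(X, y):
        rows_by_class[label].append(row)

    scores = []
    for j in range(len(X[0])):
        overall_mean = fmean(row[j] for row in X)
        between = 0.0  # class-size-weighted spread of class means
        within = 0.0   # class-size-weighted variance inside each class
        for rows in rows_by_class.values():
            column = [row[j] for row in rows]
            between += len(column) * (fmean(column) - overall_mean) ** 2
            within += len(column) * pvariance(column)
        if within > 0:
            scores.append(between / within)
        else:
            # No within-class spread: perfectly separating or constant feature.
            scores.append(float("inf") if between > 0 else 0.0)
    return scores


def select_top_k(X, y, k):
    """Return the indices of the k highest-scoring features, best first."""
    scores = fisher_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Toy data: feature 0 separates the two classes, feature 1 does not.
    X = [[1.0, 5.0], [1.2, 1.0], [3.0, 5.1], [3.2, 0.9]]
    y = [0, 0, 1, 1]
    print(select_top_k(X, y, k=1))
```

Because the score never consults a classifier, a filter of this kind is cheap to compute. Wrapper methods instead evaluate candidate feature subsets by training a model on each, and embedded methods perform the selection as part of model training itself.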
