A Review of Feature Selection Algorithms for Data Mining Techniques

Feature selection is a pre-processing step that improves mining performance by reducing data dimensionality. Although a number of feature selection algorithms exist, feature selection remains an active research area in the data mining, machine learning, and pattern recognition communities. Many feature selection algorithms face severe challenges in effectiveness and efficiency because of the recent increase in data dimensionality (data with thousands of features, attributes, or variables). This paper analyses some popular existing feature selection algorithms and addresses their strengths and challenges.
