Review of feature selection for solving classification problems

Classification of data across different domains has been extensively researched and is one of the basic methods for distinguishing one object from another, since we often need to know which item belongs to which group. A classifier can infer the class of unseen data by analyzing its structural similarity to a given dataset with known classes. The reliability of classification results is a crucial issue: the higher the accuracy of the generated results, the better the classifier. Researchers constantly seek to increase classification accuracy, either through existing techniques or through the development of new ones, and different processes are applied to this end. While most existing methods that address this task aim at improving the classifier itself, this paper focuses on reducing the number of features in the dataset by selecting only the relevant features before the dataset is given to the classifier. This motivates the need for methods capable of selecting the relevant features with minimal information loss. The aim is to reduce the workload of the classifier by using feature selection methods. With the focus on classification accuracy, this paper highlights and discusses the concept, capabilities, and use of feature selection in various classification applications. From the review, classification with feature selection methods has shown impressive results, with notably better accuracy than classification without feature selection.
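The idea of selecting relevant features before the data reaches the classifier can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the synthetic two-class data, the signal-to-noise filter score, the nearest-centroid classifier, and all function names are hypothetical stand-ins for whichever selection method and classifier a given study uses.

```python
import random
import statistics

def make_data(n_per_class=50, n_noise=8, seed=0):
    """Synthetic data (assumption): 2 informative features, then n_noise noise features."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            informative = [rng.gauss(3.0 * label, 1.0) for _ in range(2)]
            noise = [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
            X.append(informative + noise)
            y.append(label)
    return X, y

def snr_scores(X, y):
    """Filter score per feature: |mean_0 - mean_1| / (std_0 + std_1)."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, lab in zip(X, y) if lab == 0]
        b = [row[j] for row, lab in zip(X, y) if lab == 1]
        denom = statistics.pstdev(a) + statistics.pstdev(b)
        diff = abs(statistics.mean(a) - statistics.mean(b))
        scores.append(diff / denom if denom else 0.0)
    return scores

def select_top_k(scores, k):
    """Indices of the k highest-scoring features, in column order."""
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])

def project(X, cols):
    """Keep only the selected feature columns."""
    return [[row[j] for j in cols] for row in X]

def nearest_centroid_fit(X, y):
    """Stand-in classifier: one mean vector per class."""
    centroids = {}
    for lab in set(y):
        rows = [r for r, l in zip(X, y) if l == lab]
        centroids[lab] = [statistics.mean(col) for col in zip(*rows)]
    return centroids

def nearest_centroid_predict(centroids, X):
    def dist2(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return [min(centroids, key=lambda lab: dist2(row, centroids[lab])) for row in X]

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Separate draws for training and testing; features are scored on training data only.
X_train, y_train = make_data(seed=0)
X_test, y_test = make_data(seed=1)

selected = select_top_k(snr_scores(X_train, y_train), k=2)

model_sel = nearest_centroid_fit(project(X_train, selected), y_train)
acc_selected = accuracy(y_test, nearest_centroid_predict(model_sel, project(X_test, selected)))

model_full = nearest_centroid_fit(X_train, y_train)
acc_full = accuracy(y_test, nearest_centroid_predict(model_full, X_test))

print("selected feature indices:", selected)
print("accuracy with selection:", acc_selected)
print("accuracy without selection:", acc_full)
```

The selection step is fitted on the training split alone and then applied unchanged to the test split, so the reported accuracy is not inflated by information from the test data. On such an easy synthetic problem the two accuracies may be close; the sketch shows the workflow, not the size of the gain reported in the reviewed studies.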
