Maximum Relevancy Minimum Redundancy Based Feature Subset Selection using Ant Colony Optimization

In recent years, dimensionality reduction has attracted considerable interest from the machine learning community, partly because of the huge amount of data now available for processing. Classical machine learning algorithms were designed to work with limited amounts of data, and more emphasis was placed on their learning methodology, e.g. learning crisp rules from a dataset. The recent explosion of data, driven by falling storage costs and inexpensive processing power, has proved detrimental to the accuracy of classification algorithms. Therefore, feature selection, a key technique in dimensionality reduction, has become an important frontier in machine learning research. In this paper, we propose a novel filter-based feature selection method. The proposed method is based on Ant Colony Optimization (ACO) and uses Maximum Relevance and Minimum Redundancy (mRMR) for efficient subset evaluation. Although wrapper methods frequently use ACO for feature subset generation, ACO has not been thoroughly studied in the development of filter methods. The proposed method takes both feature relevancy and feature redundancy into account. Our approach ensures the selection of features that are highly relevant to the target concept, weakly redundant with each other, and useful predictors for classification algorithms. We performed extensive experiments on eleven publicly available datasets with three popular machine learning classifiers. The experimental comparisons show that the proposed method achieves higher classification accuracy while employing a reduced number of features.
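To make the idea of relevance and redundancy based subset evaluation concrete, the following is a minimal sketch and not the implementation used in this paper. It assumes discrete features, estimates mutual information from empirical frequencies, and scores a subset as mean relevance minus mean pairwise redundancy, which is one common simplified form of the mRMR criterion. The function names `mutual_information` and `mrmr_score` and the toy data are illustrative assumptions, and the ACO search that would propose candidate subsets is omitted.

```python
from collections import Counter
from itertools import combinations
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


def mrmr_score(features, target, subset):
    """Mean relevance to the target minus mean redundancy over distinct pairs.

    `features` maps a feature name to its column of discrete values.
    Averaging redundancy over distinct pairs is an assumption of this sketch.
    """
    relevance = sum(
        mutual_information(features[f], target) for f in subset
    ) / len(subset)
    pairs = list(combinations(subset, 2))
    if pairs:
        redundancy = sum(
            mutual_information(features[a], features[b]) for a, b in pairs
        ) / len(pairs)
    else:
        redundancy = 0.0  # a single feature has nothing to be redundant with
    return relevance - redundancy


# Toy data (hypothetical): "b" duplicates "a"; "c" is independent of the target.
features = {
    "a": [0, 0, 1, 1],
    "b": [0, 0, 1, 1],
    "c": [0, 1, 0, 1],
}
target = [0, 0, 1, 1]

for subset in (["a"], ["a", "b"], ["a", "c"]):
    print(subset, mrmr_score(features, target, subset))
```

On this toy data the duplicated pair `["a", "b"]` scores lower than `["a"]` alone, because the redundancy term cancels the relevance gained from the copy. In a filter method of the kind described above, a score like this would rank candidate subsets without training a classifier.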
