Relevance-redundancy feature selection based on ant colony optimization

The curse of dimensionality is a well-known problem in pattern recognition that arises when the number of patterns is smaller than the number of features in a dataset. Often, many of the features are irrelevant or redundant for the classification task. Feature selection is therefore an essential technique for reducing the dimensionality of such datasets. In this paper, unsupervised, multivariate, filter-based feature selection methods are proposed that analyze the relevance and redundancy of features. In these methods, the search space is represented as a graph, and ant colony optimization is then used to rank the features. Furthermore, a novel heuristic information measure is proposed to improve the accuracy of the methods by considering the similarity between subsets of features. The performance of the proposed methods was compared with that of well-known univariate and multivariate methods using different classifiers. The results indicate that the proposed methods outperform the existing methods.

New unsupervised feature selection methods using ant colony optimization are proposed.
A new heuristic information measure is defined to enhance the accuracy of the methods.
The proposed methods can efficiently handle both irrelevant and redundant features.
The methods are compared to well-known univariate and multivariate filter methods.
The results show the efficiency and effectiveness of the proposed methods.
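The abstract describes the approach only at a high level: features form the nodes of a graph, and ant colony optimization ranks them. The sketch below is one generic way such a scheme could look, not the authors' algorithm. It makes several assumptions that do not come from the source: absolute Pearson correlation as the similarity between two features, a heuristic of `1 - similarity` that steers ants toward dissimilar features, pheromone stored per feature and reinforced by visit counts, and the hypothetical names `abs_correlation` and `aco_rank_features` with illustrative default parameters. It captures only the redundancy side of the idea and does not implement the subset-level heuristic measure the paper proposes.

```python
import math
import random


def abs_correlation(x, y):
    """Absolute Pearson correlation between two equal-length columns."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # constant column: treat as uncorrelated
    return abs(cov / math.sqrt(vx * vy))


def aco_rank_features(columns, n_ants=10, n_iters=20, steps=None,
                      beta=1.0, rho=0.2, seed=0):
    """Rank feature indices by pheromone; all defaults are illustrative.

    columns: list of feature columns, each a list of numbers.
    """
    rng = random.Random(seed)
    n = len(columns)
    steps = steps or max(1, n // 2)  # moves each ant makes per iteration

    # Fully connected feature graph; edge weight = pairwise similarity.
    sim = [[abs_correlation(columns[i], columns[j]) for j in range(n)]
           for i in range(n)]
    pheromone = [1.0] * n

    for _ in range(n_iters):
        visits = [0] * n
        for _ in range(n_ants):
            current = rng.randrange(n)
            visited = {current}
            visits[current] += 1
            for _ in range(steps):
                candidates = [j for j in range(n) if j not in visited]
                if not candidates:
                    break
                # Assumed heuristic: prefer features dissimilar to the current one.
                weights = [pheromone[j] * (1.0 - sim[current][j] + 1e-6) ** beta
                           for j in candidates]
                current = rng.choices(candidates, weights=weights, k=1)[0]
                visited.add(current)
                visits[current] += 1
        # Evaporate, then deposit in proportion to how often ants visited.
        total = sum(visits)
        pheromone = [(1 - rho) * p + v / total
                     for p, v in zip(pheromone, visits)]

    return sorted(range(n), key=lambda j: pheromone[j], reverse=True)
```

Calling `aco_rank_features(columns)` returns the feature indices ordered from highest to lowest pheromone, and a subset can then be taken from the top of that ranking. No class labels are used anywhere, which matches the unsupervised filter setting described in the abstract.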
