A novel hybrid ACO-GA algorithm for text feature selection

In our previous work we proposed an ant colony optimization (ACO) algorithm for feature selection. In this paper, we hybridize that algorithm with a genetic algorithm (GA) to combine the strengths of both. The proposed algorithm is applied to a challenging feature selection problem: a data mining task involving the categorization of text documents. We report an extensive comparison between the proposed algorithm and three existing algorithms from the literature: an ACO-based algorithm, information gain (IG), and CHI. The proposed algorithm is easy to implement and, because it uses a simple classifier, its computational complexity is very low. Experiments are carried out on the Reuters-21578 dataset, and the simulation results show the superiority of the proposed algorithm.
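For readers unfamiliar with the information gain (IG) baseline named above, the following is a minimal sketch of the textbook IG score for a binary "term present / term absent" feature. It is an illustration under stated assumptions, not the authors' implementation: the function names (`entropy`, `information_gain`, `top_k_terms`), the set-of-words document representation, and the toy documents are all hypothetical, and the paper may use a different IG variant.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels).values()
    return -sum((n / total) * math.log2(n / total) for n in counts)


def information_gain(docs, labels, term):
    """IG of the binary feature 'term occurs in the document'.

    docs is a list of sets of terms; labels holds one class per document.
    IG = H(C) - [P(t) * H(C | t) + P(not t) * H(C | not t)].
    """
    present = [y for d, y in zip(docs, labels) if term in d]
    absent = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = (len(present) / n) * entropy(present) \
        + (len(absent) / n) * entropy(absent)
    return entropy(labels) - conditional


def top_k_terms(docs, labels, k):
    """Rank the vocabulary by IG (ties broken alphabetically), keep the top k."""
    vocabulary = sorted(set().union(*docs))
    ranked = sorted(vocabulary,
                    key=lambda t: (-information_gain(docs, labels, t), t))
    return ranked[:k]


# Hypothetical toy corpus, not taken from Reuters-21578.
docs = [
    {"wheat", "export"},
    {"wheat", "price"},
    {"oil", "price"},
    {"oil", "export"},
]
labels = ["grain", "grain", "crude", "crude"]
selected = top_k_terms(docs, labels, 2)
```

A filter of this kind scores each term independently of the others, which is the usual contrast with search-based methods such as ACO or GA that evaluate whole feature subsets.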
