Genetic Algorithm-Neural Network (GANN): a study of neural network activation functions and depth of genetic algorithm search applied to feature selection

Hybrid systems that combine genetic algorithms (GA) with artificial neural networks (ANN) are not new to the machine learning community, and they have been shown to be very successful in classification and prediction problems. However, little attention has been paid to this architecture as a feature selection method, or to how the ANN activation function and the number of GA evaluations affect feature selection performance. The activation function is one of the core components of the ANN architecture and influences the learning and generalization capability of the network. Meanwhile, the GA searches for an optimal ANN classifier given a set of chromosomes selected from those available. The objective of the GA is to combine the search for optimum chromosome choices with the search for an optimum classifier for each choice. The process operates as a form of co-evolution whose eventual objective is an optimum chromosome selection rather than an optimum classifier. In this paper, the selection of an optimum chromosome set is referred to as feature selection. Quantitative comparisons of four of the most commonly used ANN activation functions against ten GA evaluation step counts and three population sizes are presented. These studies employ four data sets with high dimensionality and few significant instances; that is, each datum has a high attribute count, and unusual or abnormal data are sparse within the data set. Results suggest that the hyperbolic tangent (tanh) activation function outperforms other common activation functions by extracting a smaller but more significant feature set. Furthermore, fitness evaluation counts ranging from 20,000 to 40,000, with population sizes ranging from 200 to 300, were found to deliver optimum feature selection capability, where optimum again means a smaller but more significant feature set.
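The co-evolution idea described above can be illustrated with a minimal sketch. It is not the GANN implementation studied in the paper, whose details the abstract does not give. The sketch assumes a binary feature mask evolved together with the weights of a one-hidden-layer network, a steady-state GA with binary tournament selection, and synthetic data. All function names, parameter names, and default values (`run_ga`, `flip_rate`, `sigma`, and so on) are hypothetical, and the population size and evaluation budget are scaled far below the ranges reported above so that the example runs quickly.

```python
import numpy as np

# Candidate hidden-layer activations (illustrative; not the paper's exact set).
ACTIVATIONS = {
    "tanh": np.tanh,
    "logistic": lambda z: 1.0 / (1.0 + np.exp(-z)),
}

def make_data(rng, n_samples=120, n_features=30, n_informative=3):
    """Synthetic data: only the first n_informative columns carry class signal."""
    X = rng.normal(size=(n_samples, n_features))
    y = (X[:, :n_informative].sum(axis=1) > 0).astype(int)
    return X, y

def random_individual(rng, n_features, n_hidden):
    """An individual holds a feature mask plus the network weights (assumed encoding)."""
    return {
        "mask": rng.random(n_features) < 0.2,
        "w1": rng.normal(scale=0.5, size=(n_features, n_hidden)),
        "w2": rng.normal(scale=0.5, size=n_hidden),
    }

def fitness(ind, X, y, act):
    """Training accuracy of the network when it sees only the masked features."""
    if not ind["mask"].any():
        return 0.0
    hidden = act((X * ind["mask"]) @ ind["w1"])
    pred = (hidden @ ind["w2"] > 0).astype(int)
    return float((pred == y).mean())

def offspring(rng, a, b, flip_rate=0.02, sigma=0.1):
    """Uniform crossover, then bit-flip mutation (mask) and Gaussian mutation (weights)."""
    child = {}
    for key in ("mask", "w1", "w2"):
        take_a = rng.random(a[key].shape) < 0.5
        child[key] = np.where(take_a, a[key], b[key])
    child["mask"] = child["mask"] ^ (rng.random(child["mask"].shape) < flip_rate)
    child["w1"] = child["w1"] + rng.normal(scale=sigma, size=child["w1"].shape)
    child["w2"] = child["w2"] + rng.normal(scale=sigma, size=child["w2"].shape)
    return child

def run_ga(X, y, act=np.tanh, pop_size=30, max_evals=1500, n_hidden=4, seed=0):
    """Steady-state GA that stops after max_evals fitness evaluations."""
    rng = np.random.default_rng(seed)
    pop = [random_individual(rng, X.shape[1], n_hidden) for _ in range(pop_size)]
    scores = [fitness(ind, X, y, act) for ind in pop]
    evals = pop_size
    while evals < max_evals:
        parents = []
        for _ in range(2):  # binary tournament for each parent
            i, j = rng.integers(pop_size, size=2)
            parents.append(pop[i] if scores[i] >= scores[j] else pop[j])
        child = offspring(rng, parents[0], parents[1])
        score = fitness(child, X, y, act)
        evals += 1
        worst = int(np.argmin(scores))
        if score >= scores[worst]:  # replace the worst individual if not worse
            pop[worst], scores[worst] = child, score
    best = int(np.argmax(scores))
    return pop[best]["mask"], scores[best], evals

if __name__ == "__main__":
    X, y = make_data(np.random.default_rng(42))
    for name, act in ACTIVATIONS.items():
        mask, score, evals = run_ga(X, y, act=act)
        print(name, "features kept:", int(mask.sum()), "accuracy:", round(score, 3))
```

Because the mask and the weights travel in the same individual, selection pressure on classification accuracy acts on both at once, yet the quantity of interest at the end is the mask rather than the classifier. Comparing how many features each activation retains at a fixed evaluation budget mirrors, in miniature, the kind of comparison the study reports; this toy makes no claim about which activation would win.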
