A Genetic Algorithm with Entropy Based Probabilistic Initialization and Memory for Automated Rule Mining

In recent years, Genetic Algorithms (GAs) have shown promising results in data mining. However, the long running times caused by computationally expensive fitness evaluations discourage the use of GAs for knowledge discovery. In this paper we propose an enhanced genetic algorithm for automated rule mining. The proposed approach supplements the GA with an entropy-based probabilistic initialization so that the initial population contains more relevant and informative attributes. Further, the GA is augmented with a memory that stores fitness scores. These additions have a twofold advantage. First, they reduce the search space of candidate rules, making the search more effective so that fitter rules evolve in fewer generations. Second, they reduce the total number of fitness evaluations required, which shortens running time. The enhanced GA has been applied to datasets from the UCI Machine Learning Repository and has shown encouraging results.
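The two additions described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the probability formula, the rule encoding, or the fitness function, so everything here is an assumption made for illustration. In particular, scaling information gain into an inclusion probability with a lower bound (`floor`), encoding a rule as a tuple with `None` for "don't care", and using rule accuracy as fitness are all hypothetical choices, as are the names `inclusion_probabilities`, `random_rule`, `rule_accuracy`, and `FitnessMemory`.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Class entropy minus the conditional entropy after splitting on attr."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    conditional = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def inclusion_probabilities(rows, labels, n_attrs, floor=0.1):
    """Assumed scheme: gain scaled by the best gain, never below `floor`."""
    gains = [information_gain(rows, labels, a) for a in range(n_attrs)]
    top = max(gains)
    if top <= 0:  # no attribute is informative: fall back to the floor
        return [floor] * n_attrs
    return [max(floor, g / top) for g in gains]


def random_rule(rows, probs, rng):
    """Build one rule; slot a holds a value of attribute a or None (unused)."""
    domains = [sorted({row[a] for row in rows}) for a in range(len(probs))]
    return tuple(
        rng.choice(domains[a]) if rng.random() < p else None
        for a, p in enumerate(probs)
    )


def rule_accuracy(rule, rows, labels, target):
    """Assumed fitness: share of rows matched by the rule that have `target`."""
    matched = [
        label
        for row, label in zip(rows, labels)
        if all(v is None or row[a] == v for a, v in enumerate(rule))
    ]
    return matched.count(target) / len(matched) if matched else 0.0


class FitnessMemory:
    """Stores fitness scores so a rule seen before is not evaluated again."""

    def __init__(self, fitness_fn):
        self.fitness_fn = fitness_fn
        self.scores = {}
        self.evaluations = 0  # number of real (non-cached) evaluations

    def __call__(self, rule):
        if rule not in self.scores:
            self.scores[rule] = self.fitness_fn(rule)
            self.evaluations += 1
        return self.scores[rule]


# Toy data (made up): attribute 0 determines the class, attribute 1 does not.
rows = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")]
labels = ["yes", "yes", "no", "no"]

probs = inclusion_probabilities(rows, labels, n_attrs=2)
rng = random.Random(0)
population = [random_rule(rows, probs, rng) for _ in range(20)]

fitness = FitnessMemory(lambda r: rule_accuracy(r, rows, labels, "yes"))
scores = [fitness(rule) for rule in population]
```

On this toy data the informative attribute appears in every initial rule while the uninformative one appears only occasionally, and `fitness.evaluations` stays well below the population size because duplicate rules are looked up rather than re-scored. A real GA would repeat the lookup across generations, where revisited rules are more common.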
