A hybrid approach to discover Bayesian networks from databases using evolutionary programming

This paper describes a data mining approach that employs evolutionary programming (EP) to discover knowledge represented as Bayesian networks. There are two main approaches to the network learning problem. The first uses dependency analysis, while the second searches for good network structures according to a scoring metric. Unfortunately, each approach has its own drawbacks. We therefore propose a hybrid algorithm that combines the two approaches and consists of two phases: a conditional independence (CI) test phase and a search phase. A new operator is introduced to further enhance search efficiency. We conduct a number of experiments comparing the hybrid algorithm with our previous algorithm, MDLEP, which uses EP for network learning. The empirical results indicate that the new approach performs better than MDLEP. We also apply the approach to direct marketing data sets and compare the evolved Bayesian networks with models generated by other methods. In this comparison, the Bayesian networks induced by the new algorithm outperform the other models.
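The two-phase idea can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the CI phase here uses only an order-0 mutual information test with an arbitrary threshold, the score is a common BIC-style MDL approximation, selection is plain truncation, and the paper's new operator is not modelled. All function names and parameters (`ci_phase`, `search_phase`, `threshold`, `pop_size`) are hypothetical.

```python
import math
import random
from collections import Counter
from itertools import combinations

def mutual_information(data, i, j):
    """Empirical mutual information (in bits) between columns i and j."""
    n = len(data)
    pij = Counter((row[i], row[j]) for row in data)
    pi = Counter(row[i] for row in data)
    pj = Counter(row[j] for row in data)
    return sum(c / n * math.log2(c * n / (pi[a] * pj[b]))
               for (a, b), c in pij.items())

def ci_phase(data, n_vars, threshold=0.01):
    """Phase 1 (simplified assumption): keep pairs that look dependent."""
    return {frozenset((i, j)) for i, j in combinations(range(n_vars), 2)
            if mutual_information(data, i, j) > threshold}

def mdl_score(data, parents):
    """Lower is better: parameter cost plus negative log-likelihood, in bits."""
    n = len(data)
    total = 0.0
    for child, pa in parents.items():
        pa = sorted(pa)
        joint = Counter((tuple(row[p] for p in pa), row[child]) for row in data)
        marg = Counter(tuple(row[p] for p in pa) for row in data)
        arity = len({row[child] for row in data})
        # Approximation: count only parent configurations seen in the data.
        total += 0.5 * math.log2(n) * (arity - 1) * len(marg)
        total -= sum(c * math.log2(c / marg[cfg]) for (cfg, _), c in joint.items())
    return total

def is_acyclic(parents):
    """Depth-first check that the parent map describes a DAG."""
    state = {}
    def visit(v):
        if state.get(v) == 1:
            return False  # reached a node still on the stack: cycle
        if state.get(v) == 2:
            return True
        state[v] = 1
        ok = all(visit(p) for p in parents[v])
        state[v] = 2
        return ok
    return all(visit(v) for v in parents)

def mutate(parents, allowed, rng):
    """Toggle one arc chosen among the pairs that survived phase 1."""
    child = {v: set(pa) for v, pa in parents.items()}
    a, b = rng.choice(sorted(tuple(sorted(e)) for e in allowed))
    if rng.random() < 0.5:
        a, b = b, a
    if a in child[b]:
        child[b].discard(a)
    else:
        child[b].add(a)
    return child if is_acyclic(child) else parents

def search_phase(data, n_vars, allowed, pop_size=10, generations=50, seed=0):
    """Phase 2: mutation-only evolutionary search restricted to allowed pairs."""
    rng = random.Random(seed)
    empty = {v: set() for v in range(n_vars)}
    if not allowed:
        return empty
    population = [empty] * pop_size
    for _ in range(generations):
        offspring = [mutate(g, allowed, rng) for g in population]
        population = sorted(population + offspring,
                            key=lambda g: mdl_score(data, g))[:pop_size]
    return population[0]

# Toy data: column 1 copies column 0, column 2 is independent of both.
data = [(a, a, b) for a in (0, 1) for b in (0, 1)] * 50
allowed = ci_phase(data, 3)
best = search_phase(data, 3, allowed)
```

On the toy data, the CI phase leaves only the pair of columns 0 and 1 as a candidate, so the search phase only ever proposes arcs between those two variables. Restricting the search space in this way is the motivation for combining the two approaches.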
