Building Fine Bayesian Networks Aided by PSO-Based Feature Selection

Interpreting data successfully depends on discovering the crucial relationships between variables. A Bayesian network can accomplish this task. The drawback is that, when many variables are involved, learning the network slows down and may lead to wrong results. In this study, we demonstrate the feasibility of applying an existing Particle Swarm Optimization (PSO)-based approach to feature selection to filter out the irrelevant attributes of the dataset, resulting in a fine Bayesian network built with the K2 algorithm. Empirical tests carried out with real data from the bioinformatics domain show that the PSO fitness function agrees closely with the most widely known validation measures for classification.
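To make the filtering step concrete, the following is a minimal sketch of a generic binary PSO search over feature masks. It is not the specific PSO-based approach used in the study: the function names (`binary_pso`, `toy_fitness`), the parameter values, and the toy fitness function are all assumptions chosen for illustration. In practice the fitness would score a candidate attribute subset on the real dataset, and the selected attributes would then be passed to a structure learner such as K2.

```python
import math
import random


def binary_pso(fitness, n_features, n_particles=20, n_iters=50,
               w=0.7, c1=1.5, c2=1.5, v_max=4.0, seed=0):
    """Search for a 0/1 feature mask that maximizes `fitness`.

    Generic binary PSO sketch; all default parameter values are
    illustrative assumptions, not values taken from the study.
    """
    rng = random.Random(seed)
    # Each position is a bit mask: 1 keeps the feature, 0 drops it.
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]

    # Personal bests start at the initial positions.
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]

    # Global best is the best of the personal bests.
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp the velocity so the sigmoid does not saturate.
                v = max(-v_max, min(v_max, v))
                vel[i][d] = v
                # The sigmoid of the velocity is the probability of a 1 bit.
                prob_one = 1.0 / (1.0 + math.exp(-v))
                pos[i][d] = 1 if rng.random() < prob_one else 0

            fit = fitness(pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit

    return gbest, gbest_fit


# Hypothetical toy problem: features 0, 3 and 5 are the relevant ones.
RELEVANT = {0, 3, 5}


def toy_fitness(mask):
    """Reward selected relevant features and penalize subset size."""
    hits = sum(1 for d in RELEVANT if mask[d])
    return hits - 0.1 * sum(mask)


if __name__ == "__main__":
    best_mask, best_fit = binary_pso(toy_fitness, n_features=10)
    selected = [d for d, bit in enumerate(best_mask) if bit]
    print("selected features:", selected, "fitness:", best_fit)
```

The size penalty in `toy_fitness` stands in for the usual preference for smaller attribute subsets; a real fitness function would replace it with a measure computed on the data.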
