Selective Naive Bayes for Regression Based on Mixtures of Truncated Exponentials

Naive Bayes models have been successfully used in classification problems where the class variable is discrete. These models have also been applied to regression or prediction problems, i.e., classification problems where the class variable is continuous, but usually under the assumption that the joint distribution of the feature variables and the class is multivariate Gaussian. In this paper we are interested in regression problems where some of the feature variables are discrete while the others are continuous. We propose a Naive Bayes predictor based on approximating the joint distribution by a Mixture of Truncated Exponentials (MTE). We follow a filter-wrapper procedure to select the variables used in the construction of the model. This scheme is based on the mutual information between each of the candidate variables and the class. Since the mutual information cannot be computed exactly for the MTE distribution, we introduce an unbiased estimator of it based on Monte Carlo methods. We test the performance of the proposed model on artificial and real-world datasets.
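
As a rough illustration of the estimation step described above, here is a minimal Python sketch of a Monte Carlo mutual-information estimator for an MTE-like density. The two-term joint density on the unit square, its parameters (a1, b1, a2, b2), the rejection-sampling bound, and all function names are illustrative assumptions, not the paper's construction. Since the samples are drawn i.i.d. from the joint density, the sample mean of the log-ratio is an unbiased estimate of the expectation defining I(X;Y), which is the property such an estimator relies on.

```python
import numpy as np

rng = np.random.default_rng(42)

# --- Toy two-term MTE joint density on [0,1]^2 (illustrative parameters) ---
# f(x, y) = k1*exp(a1*x + b1*y) + k2*exp(a2*x + b2*y)
a1, b1 = 1.5, 1.5
a2, b2 = -2.0, -2.0

def g(t):
    """Integral of exp(t*u) for u in [0, 1]."""
    return (np.exp(t) - 1.0) / t

# Weights chosen so the density integrates to 1 over the unit square.
w = 0.5
k1 = w / (g(a1) * g(b1))
k2 = (1.0 - w) / (g(a2) * g(b2))

def f_joint(x, y):
    return k1 * np.exp(a1 * x + b1 * y) + k2 * np.exp(a2 * x + b2 * y)

def f_x(x):
    # Marginal of X: integrating y out of each exponential term analytically
    # yields another MTE, a closure property of the MTE family.
    return k1 * np.exp(a1 * x) * g(b1) + k2 * np.exp(a2 * x) * g(b2)

def f_y(y):
    # Marginal of Y, by symmetry.
    return k1 * np.exp(b1 * y) * g(a1) + k2 * np.exp(b2 * y) * g(a2)

def sample_joint(n):
    """Rejection sampling from f_joint with a uniform proposal on [0,1]^2."""
    m = f_joint(1.0, 1.0) + f_joint(0.0, 0.0)  # crude upper bound on f
    xs, ys = [], []
    while len(xs) < n:
        x, y, u = rng.random(), rng.random(), rng.random()
        if u * m <= f_joint(x, y):
            xs.append(x)
            ys.append(y)
    return np.array(xs), np.array(ys)

# Monte Carlo estimate: I(X;Y) = E_f[ log f(X,Y) - log f_X(X) - log f_Y(Y) ],
# approximated by the average log-ratio over i.i.d. draws from the joint.
xs, ys = sample_joint(50_000)
mi_hat = np.mean(np.log(f_joint(xs, ys)) - np.log(f_x(xs)) - np.log(f_y(ys)))
print(f"Estimated mutual information: {mi_hat:.4f} nats")
```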
