Tree Augmented Naive Bayes for Regression Using Mixtures of Truncated Exponentials: Application to Higher Education Management

In this paper we explore the use of Tree Augmented Naive Bayes (TAN) in regression problems where some of the independent variables are continuous and others are discrete. The proposed solution approximates the joint distribution by a Mixture of Truncated Exponentials (MTE). Constructing the TAN structure requires the conditional mutual information, which cannot be obtained analytically for MTEs. To solve this problem, we introduce an unbiased estimator of the conditional mutual information based on Monte Carlo estimation. We test the performance of the proposed model in a real-life context related to higher education management, where regression problems with discrete and continuous variables are common.
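
The Monte Carlo idea behind such an estimator can be illustrated with a minimal sketch. It is not the estimator defined in the paper: the MTE densities are replaced here by a Gaussian stand-in, chosen only because its conditional mutual information has a closed form to compare against. The function names, the coefficients 1.0 and -0.5, and the parameter `rho` are all hypothetical. The sketch draws samples from the joint distribution of (X1, X2, Y) and averages log f(x1, x2 | y) - log f(x1 | y) - log f(x2 | y), which is an unbiased estimate of I(X1; X2 | Y) when the densities can be evaluated exactly.

```python
import numpy as np


def log_normal_pdf(x, mean, var):
    """Log-density of a univariate normal distribution."""
    return -0.5 * (np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)


def log_bivariate_pdf(x1, x2, m1, m2, rho):
    """Log-density of a bivariate normal with unit variances and correlation rho."""
    d1, d2 = x1 - m1, x2 - m2
    quad = d1 ** 2 - 2.0 * rho * d1 * d2 + d2 ** 2
    return (-np.log(2.0 * np.pi)
            - 0.5 * np.log(1.0 - rho ** 2)
            - quad / (2.0 * (1.0 - rho ** 2)))


def mc_conditional_mi(n, rho, seed=0):
    """Monte Carlo estimate of I(X1; X2 | Y) under an assumed Gaussian model.

    Assumed (hypothetical) model, standing in for an MTE:
      Y ~ N(0, 1); given Y = y, (X1, X2) is bivariate normal with
      means (1.0 * y, -0.5 * y), unit variances and correlation rho.
    """
    rng = np.random.default_rng(seed)
    y = rng.standard_normal(n)
    m1, m2 = 1.0 * y, -0.5 * y
    # Correlated noise with unit variances and correlation rho.
    e1 = rng.standard_normal(n)
    e2 = rho * e1 + np.sqrt(1.0 - rho ** 2) * rng.standard_normal(n)
    x1, x2 = m1 + e1, m2 + e2
    # Sample mean of the log density ratio over draws from the joint.
    log_ratio = (log_bivariate_pdf(x1, x2, m1, m2, rho)
                 - log_normal_pdf(x1, m1, 1.0)
                 - log_normal_pdf(x2, m2, 1.0))
    return log_ratio.mean()


if __name__ == "__main__":
    rho = 0.6
    estimate = mc_conditional_mi(200_000, rho)
    exact = -0.5 * np.log(1.0 - rho ** 2)  # closed form for this Gaussian model
    print(f"Monte Carlo estimate: {estimate:.4f}, closed form: {exact:.4f}")
```

In a TAN construction, a quantity of this kind would be estimated for every pair of feature variables, conditional on the response, and used to weight the edges of the spanning tree. With MTEs, the three log-densities would be evaluated from the fitted MTE potentials instead of the Gaussian formulas assumed above.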
