K-Dependence Bayesian Classifier Ensemble

To make the most of the information implicit in big data, ensemble methods generate multiple, sufficiently diverse models through randomization or perturbation. The k-dependence Bayesian classifier (KDB) is a highly scalable learning algorithm with low time and space complexity and high expressivity. This paper introduces a new ensemble of KDBs, the k-dependence forest (KDF), which induces a specific attribute order and specific conditional dependencies between attributes for each subclassifier. We demonstrate that these subclassifiers are diverse and complementary. Our extensive experimental evaluation on 40 datasets shows that this ensemble method achieves better classification performance than state-of-the-art out-of-core ensemble learners such as the averaged one-dependence estimator (AODE) and averaged tree-augmented naive Bayes (ATAN).
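As a rough illustration of what "attribute order" and "conditional dependencies" mean for a single KDB, the sketch below follows the commonly described KDB structure heuristic: rank attributes by their mutual information with the class, then give each attribute up to k parents chosen from the earlier-ranked attributes by conditional mutual information given the class. It is not the KDF procedure of this paper, whose way of varying the order across subclassifiers is not specified in the abstract. The function names (`mutual_info`, `cond_mutual_info`, `kdb_structure`) and the toy data are hypothetical, and attributes are assumed to be discrete.

```python
import math
from collections import Counter

def mutual_info(xs, cs):
    """Empirical mutual information I(X; C) in nats, from raw counts."""
    n = len(xs)
    px, pc, pxc = Counter(xs), Counter(cs), Counter(zip(xs, cs))
    return sum((nxc / n) * math.log(nxc * n / (px[x] * pc[c]))
               for (x, c), nxc in pxc.items())

def cond_mutual_info(xs, ys, cs):
    """Empirical conditional mutual information I(X; Y | C) in nats."""
    n = len(xs)
    pc = Counter(cs)
    pxc, pyc = Counter(zip(xs, cs)), Counter(zip(ys, cs))
    pxyc = Counter(zip(xs, ys, cs))
    return sum((nxyc / n) * math.log(nxyc * pc[c] / (pxc[(x, c)] * pyc[(y, c)]))
               for (x, y, c), nxyc in pxyc.items())

def kdb_structure(columns, labels, k):
    """Return (order, parents) for one KDB (illustrative sketch only).

    columns: list of attribute columns, each a list of discrete values.
    order:   attribute indices sorted by decreasing I(X_i; C).
    parents: maps each attribute to at most k earlier attributes with the
             highest I(X_i; X_j | C); the class is an implicit parent of all.
    """
    order = sorted(range(len(columns)),
                   key=lambda i: mutual_info(columns[i], labels),
                   reverse=True)
    parents = {}
    for pos, i in enumerate(order):
        earlier = order[:pos]  # only attributes already placed may be parents
        ranked = sorted(
            earlier,
            key=lambda j: cond_mutual_info(columns[i], columns[j], labels),
            reverse=True)
        parents[i] = ranked[:k]
    return order, parents

# Toy data (assumed for illustration): x0 copies the class, x1 is
# independent of it, and x2 agrees with the class on all but one instance.
labels = [0, 0, 1, 1, 0, 0, 1, 1]
x0 = [0, 0, 1, 1, 0, 0, 1, 1]
x1 = [0, 1, 0, 1, 0, 1, 0, 1]
x2 = [0, 0, 1, 0, 0, 0, 1, 1]
order, parents = kdb_structure([x0, x1, x2], labels, k=1)
```

An ensemble in the spirit of the abstract would build several such structures that differ in attribute order and parent sets and then combine their class-probability estimates; how KDF derives those differing structures is left to the full paper.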
