bnclassify: Learning Bayesian Network Classifiers

The bnclassify package provides state-of-the art algorithms for learning Bayesian network classifiers from data. For structure learning it provides variants of the greedy hill-climbing search, a well-known adaptation of the Chow-Liu algorithm and averaged one-dependence estimators. It provides Bayesian and maximum likelihood parameter estimation, as well as three naive-Bayes-specific methods based on discriminative score optimization and Bayesian model averaging. The implementation is efficient enough to allow for time-consuming discriminative scores on medium-sized data sets. bnclassify provides utilities for model evaluation, such as cross-validated accuracy and penalized log-likelihood scores, and analysis of the underlying networks, including network plotting via the Rgraphviz package. It is extensively tested, with over 200 automated tests that give a code coverage of 94%. Here we present the main functionalities, illustrate them with a number of data sets, and comment on related software.

[1]  Gregory F. Cooper,et al.  The Computational Complexity of Probabilistic Inference Using Bayesian Belief Networks , 1990, Artif. Intell..

[2]  Ian H. Witten,et al.  The WEKA data mining software: an update , 2009, SKDD.

[3]  M. Pazzani Constructive Induction of Cartesian Product Attributes , 1998 .

[4]  Krzysztof J. Cios,et al.  Bayesian learning for cardiac SPECT image interpretation , 2002, Artif. Intell. Medicine.

[5]  Geoffrey I. Webb,et al.  Efficient parameter learning of Bayesian network classifiers , 2017, Machine Learning.

[6]  Gregory F. Cooper,et al.  Exact model averaging with naive Bayesian classifiers , 2002, ICML.

[7]  Gregory F. Cooper,et al.  A Bayesian method for the induction of probabilistic networks from data , 1992, Machine Learning.

[8]  Nir Friedman,et al.  Probabilistic Graphical Models - Principles and Techniques , 2009 .

[9]  Franz Pernkopf,et al.  Efficient Heuristics for Discriminative Structure Learning of Bayesian Network Classifiers , 2010, J. Mach. Learn. Res..

[10]  Eamonn J. Keogh,et al.  Learning the Structure of Augmented Bayesian Classifiers , 2002, Int. J. Artif. Intell. Tools.

[11]  Manfred Jaeger,et al.  Probabilistic Classifiers and the Concepts They Recognize , 2003, ICML.

[12]  David A. Bell,et al.  Learning Bayesian networks from data: An information-theory based approach , 2002, Artif. Intell..

[13]  Isabelle Guyon,et al.  An Introduction to Variable and Feature Selection , 2003, J. Mach. Learn. Res..

[14]  D. Hand,et al.  Idiot's Bayes—Not So Stupid After All? , 2001 .

[15]  Scott T. Weiss,et al.  CGBayesNets: Conditional Gaussian Bayesian Network Learning and Inference with Mixed Discrete and Continuous Data , 2014, PLoS Comput. Biol..

[16]  Judea Pearl,et al.  Probabilistic reasoning in intelligent systems , 1988 .

[17]  Max Kuhn,et al.  Building Predictive Models in R Using the caret Package , 2008 .

[18]  Nir Friedman,et al.  Bayesian Network Classifiers , 1997, Machine Learning.

[19]  Sebastian Thrun,et al.  Bayesian Network Induction via Local Neighborhoods , 1999, NIPS.

[20]  Søren Højsgaard,et al.  Graphical Independence Networks with the gRain Package for R , 2012 .

[21]  Manuel Laguna,et al.  Tabu Search , 1997 .

[22]  C. N. Liu,et al.  Approximating discrete probability distributions with dependence trees , 1968, IEEE Trans. Inf. Theory.

[23]  Marco Scutari,et al.  Learning Bayesian Networks with the bnlearn R Package , 2009, 0908.3817.

[24]  Geoffrey I. Webb,et al.  Not So Naive Bayes: Aggregating One-Dependence Estimators , 2005, Machine Learning.

[25]  Concha Bielza,et al.  Discrete Bayesian Network Classifiers , 2014, ACM Comput. Surv..

[26]  Pedro M. Domingos,et al.  Learning Bayesian network classifiers by maximizing conditional likelihood , 2004, ICML.

[27]  Constantin F. Aliferis,et al.  Algorithms for Large Scale Markov Blanket Discovery , 2003, FLAIRS.

[28]  Susanne Bottcher,et al.  Learning Bayesian networks with mixed variables , 2001, AISTATS.

[29]  Remco R. Bouckaert,et al.  Bayesian network classifiers in Weka , 2004 .

[30]  Geoffrey I. Webb,et al.  Alleviating naive Bayes attribute independence assumption by attribute weighting , 2013, J. Mach. Learn. Res..

[31]  Mehran Sahami,et al.  Learning Limited Dependence Bayesian Classifiers , 1996, KDD.

[32]  Franz Pernkopf,et al.  Floating search algorithm for structure learning of Bayesian network classifiers , 2003, Pattern Recognit. Lett..

[33]  Teemu Roos,et al.  Discriminative Learning of Bayesian Networks via Factorized Conditional Log-Likelihood , 2011, J. Mach. Learn. Res..

[34]  Constantin F. Aliferis,et al.  Towards Principled Feature Selection: Relevancy, Filters and Wrappers , 2003 .

[35]  Marvin Minsky,et al.  Steps toward Artificial Intelligence , 1995, Proceedings of the IRE.

[36]  G. Schwarz Estimating the Dimension of a Model , 1978 .

[37]  Mark A. Hall,et al.  A decision tree-based attribute weighting filter for naive Bayes , 2006, Knowl. Based Syst..

[38]  D. Rubin,et al.  Maximum likelihood from incomplete data via the EM - algorithm plus discussions on the paper , 1977 .

[39]  Geoffrey I. Webb,et al.  Efficient lazy elimination for averaged one-dependence estimators , 2006, ICML.

[40]  David Maxwell Chickering,et al.  Large-Sample Learning of Bayesian Networks is NP-Hard , 2002, J. Mach. Learn. Res..

[41]  Concha Bielza,et al.  Decision boundary for discrete Bayesian network classifiers , 2015, J. Mach. Learn. Res..

[42]  N. Wermuth,et al.  Graphical Models for Associations between Variables, some of which are Qualitative and some Quantitative , 1989 .

[43]  R. Bouckaert Bayesian belief networks : from construction to inference , 1995 .

[44]  H. Akaike A new look at the statistical model identification , 1974 .

[45]  C. D. Gelatt,et al.  Optimization by Simulated Annealing , 1983, Science.