Decision boundary for discrete Bayesian network classifiers

Bayesian network classifiers are a powerful machine learning tool. To evaluate the expressive power of these models, we compute families of polynomials that sign-represent the decision functions induced by Bayesian network classifiers. We prove that these families are linear combinations of products of Lagrange basis polynomials. In the absence of V-structures in the predictor sub-graph, we also prove that this family of polynomials characterizes the specific classifier considered. We then use this representation to bound the number of decision functions representable by Bayesian network classifiers with a given structure.
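
To make the idea of a sign-representing polynomial concrete, the following minimal sketch treats only the simplest structure, a naive Bayes classifier with a binary class and two discrete predictors. It is an illustration under stated assumptions, not the paper's general construction: the probability tables and the names `prior`, `cpt`, `log_odds`, and `polynomial` are hypothetical. In this special case the log-odds is a sum of per-predictor terms, and each term can be written as a linear combination of univariate Lagrange basis polynomials; products of basis polynomials would presumably appear once predictors have other predictors as parents.

```python
import itertools
import math


def lagrange_basis(k, values):
    """Return the Lagrange basis polynomial l_k over the nodes `values`.

    l_k(x) equals 1 at x == k and 0 at every other node in `values`.
    """
    def l(x):
        out = 1.0
        for j in values:
            if j != k:
                out *= (x - j) / (k - j)
        return out
    return l


# Hypothetical naive Bayes parameters (illustrative numbers only).
prior = {0: 0.6, 1: 0.4}
# cpt[i][c][v] = P(X_i = v | C = c)
cpt = [
    {0: [0.7, 0.2, 0.1], 1: [0.1, 0.3, 0.6]},  # X_0 takes values 0, 1, 2
    {0: [0.8, 0.2], 1: [0.3, 0.7]},            # X_1 takes values 0, 1
]


def log_odds(x):
    """Log-odds of class 1 versus class 0, computed directly from the tables."""
    s = math.log(prior[1] / prior[0])
    for i, v in enumerate(x):
        s += math.log(cpt[i][1][v] / cpt[i][0][v])
    return s


def polynomial(x):
    """The same decision function written as a polynomial in x.

    Each per-predictor term is a linear combination of Lagrange basis
    polynomials whose coefficients are the log-likelihood ratios.
    """
    s = math.log(prior[1] / prior[0])
    for i, xi in enumerate(x):
        values = range(len(cpt[i][0]))
        for k in values:
            weight = math.log(cpt[i][1][k] / cpt[i][0][k])
            s += weight * lagrange_basis(k, values)(xi)
    return s


# The polynomial agrees in sign with the classifier on every configuration.
for x in itertools.product(range(3), range(2)):
    predicted = 1 if log_odds(x) > 0 else 0
    from_poly = 1 if polynomial(x) > 0 else 0
    print(x, predicted, from_poly)
```

On the finite grid of predictor values the polynomial coincides with the log-odds, so the two columns printed by the loop match; away from the grid the polynomial is unconstrained, which is all that sign-representation requires.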
