VC dimension and inner product space induced by Bayesian networks

Bayesian networks are graphical tools for representing high-dimensional probability distributions. They are used frequently in machine learning and in many applications, such as medical science. This paper studies whether the concept classes induced by a Bayesian network can be embedded into a low-dimensional inner product space. We focus on two-label classification tasks over the Boolean domain. For full Bayesian networks and almost full Bayesian networks with $n$ variables, we show that the VC dimension and the minimum dimension of the inner product space induced by them are both $2^n - 1$. Also, for each Bayesian network $N$ we show that $\mathrm{VCdim}(N) = \mathrm{Edim}(N) = 2^{n-1} + 2^i$ if the network $N'$ constructed from $N$ by removing $X_n$ satisfies either (i) $N'$ is a full Bayesian network with $n-1$ variables, $i$ is the number of parents of $X_n$, and i
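As a quick arithmetic illustration of the two closed forms quoted in the abstract, the following Python sketch evaluates them for small networks. The function names are hypothetical and chosen only for this example. The code simply computes the stated expressions; it does not derive or verify the VC dimension. The remaining condition on i in case (i) is cut off in the text above, so the sketch checks only that i is a feasible number of parents.

```python
def dim_full_network(n: int) -> int:
    """Value stated for full and almost full networks with n variables: 2^n - 1."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return 2**n - 1


def dim_case_i(n: int, i: int) -> int:
    """Value stated for case (i): 2^(n-1) + 2^i.

    Assumes N' (N without X_n) is a full network on n-1 variables and
    X_n has i parents. Any further restriction on i from the source is
    truncated there and is NOT checked here.
    """
    if n < 2:
        raise ValueError("n must be at least 2 so that N' has a variable")
    if not 0 <= i <= n - 1:
        raise ValueError("X_n can have between 0 and n-1 parents")
    return 2 ** (n - 1) + 2**i


if __name__ == "__main__":
    for n in range(2, 6):
        print(n, dim_full_network(n), [dim_case_i(n, i) for i in range(n - 1)])
```

For example, with n = 4 and i = 1 the second expression gives 2^3 + 2^1 = 10, compared with 2^4 - 1 = 15 for a full network on the same four variables.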
