On the Number of Samples Needed to Learn the Correct Structure of a Bayesian Network

Bayesian Networks (BNs) are useful tools that give a natural and compact representation of joint probability distributions. In many applications one needs to learn a BN from data. In this context, it is important to understand how many samples are needed to guarantee successful learning. Previous work has studied the sample complexity of BNs, yet it mainly focused on the requirement that the learned distribution be close to the original distribution that generated the data. In this work, we study a different aspect of learning, namely the number of samples needed to learn the correct structure of the network. We give both asymptotic results, valid in the large-sample limit, and experimental results, demonstrating the learning behavior for feasible sample sizes. We show that structure learning is a more difficult task than approximating the correct distribution, in the sense that it requires a much larger number of samples, regardless of the computational power available to the learner.
