A Theoretical Framework for Learning Bayesian Networks with Parameter Inequality Constraints

The task of learning models for many real-world problems requires incorporating domain knowledge into learning algorithms, to enable accurate learning from a realistic volume of training data. Domain knowledge can come in many forms. For example, expert knowledge about the relevance of variables to a particular problem can guide feature selection, and domain knowledge about the conditional independence relationships among variables can aid in learning the structure of a Bayesian Network. This paper considers a different type of domain knowledge, used to constrain parameter estimates when learning Bayesian Networks. In particular, we consider domain knowledge that comes in the form of inequality constraints among subsets of parameters in a Bayesian Network with known structure. These parameter constraints are incorporated into learning procedures for Bayesian Networks by formulating the estimation task as a constrained optimization problem. The main contribution of this paper is the derivation of closed-form Maximum Likelihood parameter estimators in the above setting.
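As an illustration of the kind of closed-form estimators involved, consider a minimal sketch (a hypothetical example chosen here for exposition, not a result quoted from the paper): a single multinomial variable with parameters \theta_1, \dots, \theta_K, observed counts N_1, \dots, N_K with N = \sum_k N_k, and the inequality constraint \theta_1 \le \theta_2. Maximizing the log likelihood \sum_k N_k \log \theta_k subject to \sum_k \theta_k = 1 and \theta_1 \le \theta_2 gives, by the KKT conditions,

\hat{\theta}_k = \frac{N_k}{N} \quad \text{for all } k, \qquad \text{if } N_1 \le N_2 \text{ (constraint inactive)},

\hat{\theta}_1 = \hat{\theta}_2 = \frac{N_1 + N_2}{2N}, \qquad \hat{\theta}_k = \frac{N_k}{N} \ \ (k \ge 3), \qquad \text{if } N_1 > N_2 \text{ (constraint active)}.

In words: when the unconstrained Maximum Likelihood estimate already satisfies the inequality, the constraint is inactive and the standard frequency estimates are unchanged; when it is violated, the two constrained parameters are tied together and their counts pooled, yielding a closed-form solution to the corresponding constrained optimization problem.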
