Bayesian network approach to multinomial parameter learning using data and expert judgments

Abstract: One of the hardest challenges in building a realistic Bayesian Network (BN) model is constructing the node probability tables (NPTs). Even with a fixed, predefined model structure and very large amounts of relevant data, machine learning methods do not consistently learn NPT entries (parameters) that are close to the ground truth. Hence, it is widely believed that incorporating expert judgments can improve the learning process. We present a multinomial parameter learning method that incorporates both expert judgments and data during parameter learning. The method uses an auxiliary BN model to learn the parameters of a given BN. The auxiliary BN contains continuous variables, and parameter estimation amounts to updating these variables using an iterative discretization technique. The expert judgments are provided as constraints on parameters, divided into two categories: linear inequality constraints and approximate equality constraints. The method is evaluated with experiments based on a number of well-known sample BN models (such as Asia, Alarm and Hailfinder) as well as a real-world software defects prediction BN model. Empirically, the new method achieves much greater learning accuracy with much less data than both state-of-the-art machine learning techniques and directly competing methods. For example, in the software defects BN with a sample size of 20 (which would be considered difficult to collect in practice) and a small number of real expert constraints, our method achieves a level of accuracy in parameter estimation that other methods can match only with much larger sample sizes (320 samples for the standard machine learning method, and 105 for the directly competing method with constraints).
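
To make the idea of combining data with a linear inequality constraint concrete, the following is a minimal sketch and not the method described above. It assumes two binary-node parameters, p1 and p2 (for example, the probability of the same state under two different parent configurations), a uniform prior, and a single expert constraint p1 > p2. It also replaces the iterative discretization with a fixed uniform grid. The function name `constrained_posterior_means` and all of its arguments are hypothetical and chosen for illustration only.

```python
def constrained_posterior_means(k1, n1, k2, n2, grid_size=200):
    """Posterior means of (p1, p2) given binomial counts and the constraint p1 > p2.

    k1 of n1 observations support p1; k2 of n2 support p2.
    Assumptions (not from the paper): uniform prior, fixed uniform grid.
    """
    # Midpoints of a fixed uniform grid on (0, 1).
    grid = [(i + 0.5) / grid_size for i in range(grid_size)]

    def likelihood(p, k, n):
        # Binomial likelihood up to a constant factor.
        return p ** k * (1.0 - p) ** (n - k)

    lik1 = [likelihood(p, k1, n1) for p in grid]
    lik2 = [likelihood(p, k2, n2) for p in grid]

    total = mean1 = mean2 = 0.0
    for i, p1 in enumerate(grid):
        # Only grid cells with p2 < p1 satisfy the expert constraint.
        for j in range(i):
            weight = lik1[i] * lik2[j]
            total += weight
            mean1 += weight * p1
            mean2 += weight * grid[j]
    return mean1 / total, mean2 / total


# With little data the unconstrained estimates (2/5 and 3/5) violate the
# constraint; the constrained posterior means respect it by construction.
p1_hat, p2_hat = constrained_posterior_means(k1=2, n1=5, k2=3, n2=5)
```

In this toy setting the constraint acts as an indicator that removes all probability mass from the region p1 <= p2, so the estimates satisfy the expert judgment even when the raw frequencies from a small sample do not.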
