Approximation Methods for Efficient Learning of Bayesian Networks

This publication presents and investigates efficient Monte Carlo simulation methods for realizing a Bayesian approach to approximate learning of Bayesian networks from both complete and incomplete data. For large amounts of incomplete data, where Monte Carlo methods are inefficient, approximations are implemented so that learning remains feasible, albeit non-Bayesian. Topics discussed are: basic concepts of probability, graph theory and conditional independence; Bayesian network learning from data; Monte Carlo simulation techniques; and the concept of incomplete data. To provide a coherent treatment and help the reader gain a thorough understanding of learning Bayesian networks from (in)complete data, this publication combines the issues presented in the papers with previously unpublished work.
