A Bayesian hierarchical score for structure learning from related data sets

Score functions in the literature for learning the structure of Bayesian networks assume that the data are a homogeneous set of observations. In practice, data often comprise several related but not homogeneous data sets collected in different ways. In this paper we propose a new Bayesian Dirichlet score, which we call Bayesian Hierarchical Dirichlet (BHD). The proposed score is based on a hierarchical model that pools information across data sets to learn a single encompassing network structure while taking into account the differences in their probabilistic structures. We derive a closed-form expression for BHD using a variational approximation of the marginal likelihood, study the associated computational cost, and evaluate its performance using simulated data. We find that, when data comprise multiple related data sets, BHD outperforms the Bayesian Dirichlet equivalent uniform (BDeu) score in terms of reconstruction accuracy as measured by the structural Hamming distance, and that it is as accurate as BDeu when data are homogeneous. This improvement is particularly clear when either the number of variables in the network or the number of observations is large. Moreover, the estimated networks are sparser, and therefore more interpretable, than those obtained with BDeu thanks to a lower number of false positive arcs.
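For readers unfamiliar with the baseline, the following is a minimal sketch of the standard BDeu local score for a single node, which is the quantity BHD is compared against. It is not the BHD score and is not taken from the paper; the function name `bdeu_local_score`, the `counts[j][k]` input layout, and the default imaginary sample size are assumptions made for illustration.

```python
from math import lgamma


def bdeu_local_score(counts, alpha=1.0):
    """Log BDeu marginal likelihood of one node given its parents.

    counts[j][k] is the number of observations with the parents in
    configuration j and the node in state k (hypothetical layout).
    alpha is the imaginary (equivalent) sample size.
    """
    q = len(counts)         # number of parent configurations
    r = len(counts[0])      # number of states of the node
    a_j = alpha / q         # prior mass per parent configuration
    a_jk = alpha / (q * r)  # prior mass per (configuration, state) cell
    score = 0.0
    for row in counts:
        n_j = sum(row)
        score += lgamma(a_j) - lgamma(a_j + n_j)
        for n_jk in row:
            score += lgamma(a_jk + n_jk) - lgamma(a_jk)
    return score


# A binary node with no parents, observed once in each state.
print(bdeu_local_score([[1, 1]], alpha=2.0))
```

Computing this score on counts pooled over all data sets treats every observation as exchangeable, which is the homogeneity assumption described above.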
