Using Bayesian graphical models to model biases in observational studies and to combine multiple data sources: Application to low birth-weight and water disinfection by-products

Data in the social, behavioral and health sciences frequently come from observational studies rather than controlled experiments. Beyond random error, observational data typically carry further sources of uncertainty, such as missing values, unmeasured confounders, and selection biases. Moreover, because the research question is often complex, a single data set may not provide sufficient information for valid inference. Multiple data sources are therefore often needed to identify the biases and to inform different aspects of the research question. Standard analyses that treat each data source separately may fail to capture uncertainty other than simple random error, and may therefore produce misleading results. It thus becomes necessary to link the sub-models for each source in a comprehensive way. Bayesian graphical models provide a coherent way to connect a series of local sub-models, each based on a different data set, into a single global analysis. In this manuscript, we present a unified modeling framework that accounts for multiple biases simultaneously and gives more accurate parameter estimates than standard approaches. We illustrate our approach by analyzing data from a study of water disinfection by-products and adverse birth outcomes in the U.K.
