Interval Estimation for Messy Observational Data

We review aspects of Bayesian and frequentist interval estimation, focusing first on their relative strengths and weaknesses in "clean" or "textbook" contexts. We then turn to "messy" observational-data settings, where modeling that acknowledges the limitations of study design and data collection leads to nonidentifiability. We argue, via a series of examples, that Bayesian interval estimation is an attractive way to proceed in this context, even for frequentists, because it can be accompanied by a diagnostic in the form of a calibration-sensitivity simulation analysis. We illustrate the basis for this approach through theoretical considerations, simulations, and an application to a study of silica exposure and lung cancer.
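
To make the idea of a calibration-sensitivity simulation concrete, here is a minimal sketch, not the analysis from the paper. It assumes a toy conjugate normal model (which is identified, unlike the models the abstract has in mind), and the function name `coverage` and all parameter values are hypothetical choices for illustration. Parameters are drawn from a generating distribution, data are simulated, a 95% posterior credible interval is formed under the analyst's prior, and the fraction of intervals covering the true parameter is recorded. When the generating distribution equals the prior, the average coverage matches the nominal level; shifting the generating distribution away from the prior shows how sensitive that calibration is.

```python
import numpy as np

Z95 = 1.959964  # 97.5% quantile of the standard normal


def coverage(gen_mean, gen_sd, prior_mean=0.0, prior_sd=1.0,
             noise_sd=1.0, n_obs=5, n_sims=20000, seed=0):
    """Average coverage of 95% credible intervals in a normal-normal model.

    Parameters are drawn from N(gen_mean, gen_sd^2) (the "truth-generating"
    distribution), while the analyst uses the prior N(prior_mean, prior_sd^2).
    All names and defaults here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    # Draw true parameters, then the sample mean of n_obs noisy observations.
    theta = rng.normal(gen_mean, gen_sd, size=n_sims)
    ybar = rng.normal(theta, noise_sd / np.sqrt(n_obs))
    # Conjugate posterior: precisions add, mean is precision-weighted.
    prior_prec = 1.0 / prior_sd**2
    data_prec = n_obs / noise_sd**2
    post_var = 1.0 / (prior_prec + data_prec)
    post_mean = post_var * (prior_prec * prior_mean + data_prec * ybar)
    half_width = Z95 * np.sqrt(post_var)
    # Fraction of simulated datasets whose interval contains the truth.
    return float(np.mean(np.abs(theta - post_mean) <= half_width))


# Generating distribution equals the prior: coverage is close to 0.95.
calibrated = coverage(gen_mean=0.0, gen_sd=1.0)
# Generating distribution shifted away from the prior: coverage drops.
shifted = coverage(gen_mean=2.0, gen_sd=1.0)
print(f"matched prior: {calibrated:.3f}, shifted truth: {shifted:.3f}")
```

Repeating the second call over a grid of generating distributions gives a picture of how far the prior can be from reality before the credible intervals lose their frequentist interpretation.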
