Bayesian Restricted Likelihood Methods

Bayesian methods have proven successful across a wide range of scientific problems and have many well-documented advantages over competing methods. However, these methods run into difficulties for two major and prevalent classes of problems: handling data sets with outliers and dealing with model misspecification. This dissertation introduces the restricted likelihood as a solution to these problems. When working with the restricted likelihood, we summarize the data through a set of (insufficient) statistics that target the inferential quantities of interest, and we update the prior distribution with the summary statistics rather than the complete data. Through the choice of conditioning statistics, we retain the main benefits of Bayesian methods while reducing the sensitivity of the analysis to features of the data not picked up by the conditioning statistics. The method is general, but this dissertation concentrates on applying the restricted likelihood to the common Bayesian linear model when outliers are of concern. For conditioning statistics we mostly consider classical robust M-estimators. In this sense, the method can be viewed as a blend of classical robust estimation techniques with the Bayesian paradigm. Of considerable interest is the comparison of the new method with more traditional approaches to dealing with outliers. In the face of model inadequacy caused by outliers, the traditional view is that one should build a better model that attempts to explain the outlier-generating process. Two classical parametric approaches to the problem replace the standard density with a thick-tailed density or a mixture density. Several data analyses make apparent the benefits of the new approach over the traditional approaches. These benefits often manifest themselves through more realistic assumptions and more precise predictive performance. A major contribution of this work is the development of implementation strategies to fit these models.
Since restricted likelihoods are rarely tractable, implementation is nontrivial. For low-dimensional problems, computational methods relying on density estimation and numerical integration can often be employed efficiently. These methods break down in higher dimensions, so we develop a novel Markov chain Monte Carlo (MCMC) algorithm to handle these situations. In particular, the MCMC algorithm for the traditional posterior is augmented with a step that simulates new data conditional on the parameters and the observed summary statistics. Derivations of the adjustments needed for this data-augmented algorithm are given for the linear regression setting. Model choice within the restricted likelihood paradigm is also introduced.
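
As a rough illustration of the low-dimensional strategy mentioned above (density estimation plus numerical integration), the following sketch approximates a restricted posterior p(theta | T(y)) proportional to p(theta) f(T(y) | theta) on a grid. It is not the dissertation's implementation. Everything specific here is an assumption made for the example: a normal location model with known scale, the sample median as the conditioning statistic T, a N(0, 10^2) prior, a rule-of-thumb kernel bandwidth, and the hypothetical function name `restricted_posterior_grid`.

```python
import numpy as np


def restricted_posterior_grid(y, theta_grid, prior_pdf, sigma=1.0,
                              n_sim=4000, seed=0):
    """Grid approximation to p(theta | median(y)) for y_i ~ N(theta, sigma^2).

    Illustrative sketch only: the model, the statistic and the bandwidth
    rule are assumptions, not choices taken from the source text.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    t_obs = np.median(y)  # the observed conditioning statistic

    # Location family: median(theta + sigma * z) = theta + sigma * median(z),
    # so the sampling distribution of the median is simulated once at
    # theta = 0 and shifted to every grid value of theta.
    sim_medians = sigma * np.median(rng.standard_normal((n_sim, n)), axis=1)

    # Gaussian kernel density estimate of the restricted likelihood
    # f(t_obs | theta), using a rule-of-thumb bandwidth.
    h = 1.06 * sim_medians.std(ddof=1) * n_sim ** (-0.2)
    u = (t_obs - theta_grid[:, None] - sim_medians[None, :]) / h
    restricted_lik = np.exp(-0.5 * u ** 2).sum(axis=1) / (
        n_sim * h * np.sqrt(2.0 * np.pi))

    # Prior times restricted likelihood, normalised with the trapezoid rule.
    unnorm = prior_pdf(theta_grid) * restricted_lik
    norm = np.sum(0.5 * (unnorm[1:] + unnorm[:-1]) * np.diff(theta_grid))
    return unnorm / norm


def normal_prior(theta, scale=10.0):
    """Density of the assumed N(0, scale^2) prior."""
    return np.exp(-0.5 * (theta / scale) ** 2) / (scale * np.sqrt(2.0 * np.pi))


# Synthetic data: 30 clean observations centred at 2, plus three gross outliers.
data_rng = np.random.default_rng(1)
y = np.concatenate([data_rng.normal(2.0, 1.0, size=30), [25.0, 30.0, 40.0]])

theta_grid = np.linspace(-5.0, 10.0, 601)
post = restricted_posterior_grid(y, theta_grid, normal_prior)

# Posterior mean by the same trapezoid rule.
weighted = theta_grid * post
post_mean = np.sum(0.5 * (weighted[1:] + weighted[:-1]) * np.diff(theta_grid))
print("sample mean:", y.mean())
print("sample median:", np.median(y))
print("restricted posterior mean:", post_mean)
```

Because the update uses only the median, the three outliers affect the posterior only through their effect on that statistic, so the posterior concentrates near the sample median rather than the sample mean. The grid and kernel estimate become impractical as the number of parameters and statistics grows, which is the situation the MCMC approach described above is meant to address.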
