On the Null Distribution of Bayes Factors in Linear Regression

ABSTRACT We show that under the null, the quantity 2 log(Bayes factor) is asymptotically distributed as a weighted sum of chi-squared random variables with a shifted mean. This claim holds for Bayesian multi-linear regression with a family of conjugate priors, namely, the normal-inverse-gamma prior, the g-prior, and the normal prior. Our results have three immediate impacts. First, we can compute analytically a p-value associated with a Bayes factor without the need for permutation. We provide a software package that can evaluate the p-value associated with a Bayes factor efficiently and accurately. Second, the null distribution illuminates some intrinsic properties of the Bayes factor, namely, how the Bayes factor quantitatively depends on the prior, and the genesis of Bartlett’s paradox. Third, motivated by the null distribution of the Bayes factor, we formulate a novel scaled Bayes factor that depends less on the prior and is immune to Bartlett’s paradox. When two tests have an identical p-value, the test with larger power tends to have a larger scaled Bayes factor, a desirable property that is missing for the (unscaled) Bayes factor. Supplementary materials for this article are available online.
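
To make the phrase "weighted sum of chi-squared random variables with a shifted mean" concrete, the following is a minimal sketch of how a tail probability (a p-value) for such a statistic could be approximated by simulation. It is not the authors' software package and does not reproduce their analytic method. The function name `weighted_chisq_tail`, the weights, the shift, and the observed value are all hypothetical placeholders chosen for illustration; in the paper's setting the weights and shift would be determined by the design matrix and the prior.

```python
import numpy as np


def weighted_chisq_tail(x, weights, shift=0.0, n_draws=200_000, seed=0):
    """Monte Carlo estimate of P(shift + sum_i w_i * Q_i >= x).

    Q_i are independent chi-squared variables with 1 degree of freedom.
    `weights` and `shift` are illustrative inputs (assumptions), not
    quantities taken from the paper.
    """
    rng = np.random.default_rng(seed)
    w = np.asarray(weights, dtype=float)
    # One row per simulated statistic, one column per chi-squared component.
    q = rng.chisquare(df=1, size=(n_draws, w.size))
    stat = shift + q @ w
    # Fraction of simulated null statistics at least as large as x.
    return float(np.mean(stat >= x))


if __name__ == "__main__":
    # Hypothetical weights, shift, and observed value.
    p = weighted_chisq_tail(x=4.0, weights=[0.9, 0.5, 0.1], shift=-1.5)
    print(f"approximate tail probability: {p:.4f}")
```

Simulation is used here only because it is easy to verify; its resolution is limited by `n_draws`, so it cannot resolve very small p-values. That limitation is one reason an analytic evaluation, as described in the abstract, is attractive.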
