False discovery control in large‐scale spatial multiple testing

This paper develops a unified theoretical and computational framework for false discovery control in multiple testing of spatial signals. We consider both pointwise and clusterwise spatial analyses, and derive oracle procedures that optimally control the false discovery rate, the false discovery exceedance and the false cluster rate. A data-driven finite approximation strategy is developed to mimic the oracle procedures on a continuous spatial domain. Our multiple-testing procedures are asymptotically valid and can be implemented effectively using Bayesian computational algorithms for the analysis of large spatial data sets. Numerical results show that the proposed procedures give more accurate error control and better power than conventional methods. We demonstrate our methods by analysing time trends in tropospheric ozone in the eastern USA.
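As a rough illustration of how a posterior-based false discovery rate rule can operate at a finite set of locations, the sketch below ranks locations by their posterior probability of being null and rejects the largest set whose average posterior null probability stays at or below the nominal level. This is a generic step-up rule of the kind used in Bayesian multiple testing, not necessarily the exact procedure derived in the paper; the function name `posterior_fdr_reject`, its arguments, and the example probabilities are hypothetical, and the posterior probabilities are assumed to come from some fitted spatial model (for example, MCMC output).

```python
def posterior_fdr_reject(post_null, alpha=0.1):
    """Generic posterior-probability step-up rule (illustrative sketch only).

    post_null: posterior probabilities that each location is null
               (assumed to be supplied by a fitted spatial model).
    alpha:     nominal false discovery rate level.
    Returns a list of booleans, True where the null is rejected.
    """
    # Rank locations from most to least likely to carry a signal.
    order = sorted(range(len(post_null)), key=lambda i: post_null[i])

    # The running mean of the sorted probabilities estimates the FDR of
    # rejecting the top-ranked locations; keep the largest admissible set.
    running_sum = 0.0
    k = 0
    for rank, i in enumerate(order, start=1):
        running_sum += post_null[i]
        if running_sum / rank <= alpha:
            k = rank

    reject = [False] * len(post_null)
    for i in order[:k]:
        reject[i] = True
    return reject


# Hypothetical posterior null probabilities at five locations.
decisions = posterior_fdr_reject([0.01, 0.9, 0.05, 0.3, 0.02], alpha=0.1)
```

With these made-up inputs, the four smallest probabilities average 0.095, so four locations are rejected and the location with posterior null probability 0.9 is retained. Because the sorted running mean is non-decreasing, the last rank that satisfies the bound identifies the largest admissible rejection set.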
