
False Discovery Rate Smoothing

ABSTRACT We present false discovery rate (FDR) smoothing, an empirical-Bayes method for exploiting spatial structure in large multiple-testing problems. FDR smoothing automatically finds spatially localized regions of significant test statistics. It then relaxes the threshold of statistical significance within these regions, and tightens it elsewhere, in a manner that controls the overall false discovery rate at a given level. This results in increased power and cleaner spatial separation of signals from noise. The approach requires solving a nonstandard high-dimensional optimization problem, for which an efficient augmented-Lagrangian algorithm is presented. In simulation studies, FDR smoothing exhibits state-of-the-art performance at modest computational cost. In particular, it is shown to be far more robust than existing methods for spatially dependent multiple testing. We also apply the method to a dataset from an fMRI experiment on spatial working memory, where it detects patterns that are much more biologically plausible than those detected by standard FDR-controlling methods. All code for FDR smoothing is publicly available in Python and R (https://github.com/tansey/smoothfdr). Supplementary materials for this article are available online.
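The region-dependent thresholding described in the abstract can be illustrated with a small sketch. It is not the authors' implementation and does not use the smoothfdr package. It assumes a two-groups model with a standard normal null and a hypothetical N(2, 1) alternative, and it takes the per-site prior probability of signal as given, whereas the paper estimates the spatial structure by solving an optimization problem. All function names and parameter values here are illustrative.

```python
import numpy as np

def normal_pdf(z, mean=0.0, sd=1.0):
    """Density of a normal distribution, evaluated elementwise."""
    return np.exp(-0.5 * ((z - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def posterior_signal_prob(z, prior, alt_mean=2.0, alt_sd=1.0):
    """Posterior probability of signal at each site under a two-groups model.

    Assumptions (not from the source): N(0, 1) null, N(alt_mean, alt_sd) alternative,
    and a known per-site prior probability of signal.
    """
    f0 = normal_pdf(z)
    f1 = normal_pdf(z, alt_mean, alt_sd)
    return prior * f1 / (prior * f1 + (1.0 - prior) * f0)

def reject_at_fdr(post, alpha=0.1):
    """Reject the largest set of sites whose average posterior null probability is <= alpha."""
    local_fdr = 1.0 - post
    order = np.argsort(local_fdr)
    # The running mean of an ascending sequence is nondecreasing,
    # so counting entries <= alpha gives the largest admissible set size.
    running_mean = np.cumsum(local_fdr[order]) / np.arange(1, len(post) + 1)
    k = int(np.sum(running_mean <= alpha))
    rejected = np.zeros(len(post), dtype=bool)
    rejected[order[:k]] = True
    return rejected

# Same z-score at two sites: one inside a high-prior region, one outside it.
z = np.array([2.0, 2.0])
prior = np.array([0.5, 0.02])
print(posterior_signal_prob(z, prior))
```

With identical test statistics, the site with the higher prior receives the higher posterior probability of signal, which is the sense in which the significance threshold is relaxed inside a region and tightened elsewhere.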

Oluwasanmi Koyejo | Russell A. Poldrack | James G. Scott | Wesley Tansey
