Multiple comparison correction methods for whole-body magnetic resonance imaging

Abstract. Purpose: Voxel-level hypothesis testing on images suffers from test multiplicity. Numerous correction methods exist, but they have mainly been applied and evaluated on neuroimaging and synthetic datasets. Newly developed approaches such as Imiomics, which use different data and less common analysis types, also require multiplicity correction for reliable inference. To handle multiple comparisons in Imiomics, we aim to evaluate correction methods on whole-body MRI and correlation analyses, and to develop techniques specifically suited to these analyses. Approach: We evaluate the most common familywise error rate (FWER)-limiting procedures on whole-body correlation analyses via standard (synthetic no-activation) nominal error rate estimation as well as a smaller, prior-knowledge-based stringency analysis. Their performance is compared with that of our anatomy-based method extensions. Results: Nonparametric methods behave better for the given analyses. The proposed prior-knowledge-based evaluation shows that the devised extensions, which include anatomical priors, can achieve the same power while keeping the FWER closer to the desired rate. Conclusions: Permutation-based approaches perform adequately and can be used within Imiomics. They can be improved by including information on image structure. We expect such method extensions to become even more relevant with new applications and larger datasets.
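To make the permutation-based FWER control mentioned in the abstract concrete, the following is a minimal sketch of the generic max-statistic permutation procedure applied to a voxelwise Pearson correlation analysis. It is not the authors' implementation and does not include their anatomy-based extensions; the function name `maxstat_permutation_fwer`, its parameters, and the array layout (subjects by voxels) are assumptions made for illustration. The sketch also assumes that no voxel has zero variance across subjects.

```python
import numpy as np

def maxstat_permutation_fwer(images, covariate, n_perm=1000, alpha=0.05, seed=0):
    """Voxelwise Pearson correlation with max-statistic permutation FWER control.

    images: (n_subjects, n_voxels) array (hypothetical layout).
    covariate: (n_subjects,) array of the non-imaging variable.
    Returns the observed correlations, the |r| threshold, and a boolean mask.
    """
    rng = np.random.default_rng(seed)
    # Standardize both inputs so correlation reduces to a scaled dot product.
    x = (images - images.mean(axis=0)) / images.std(axis=0)
    y = (covariate - covariate.mean()) / covariate.std()
    n = len(y)
    r_obs = x.T @ y / n

    max_null = np.empty(n_perm)
    for i in range(n_perm):
        # Shuffling the covariate breaks the image-covariate association
        # while leaving the spatial dependence between voxels intact.
        r_perm = x.T @ rng.permutation(y) / n
        max_null[i] = np.abs(r_perm).max()

    # The (1 - alpha) quantile of the maximum statistic gives one threshold
    # for all voxels, which limits the familywise error rate.
    threshold = np.quantile(max_null, 1.0 - alpha)
    return r_obs, threshold, np.abs(r_obs) > threshold
```

Because a single threshold is derived from the null distribution of the image-wide maximum, the procedure adapts to the spatial correlation of the data without parametric assumptions. In practice the observed labelling is often counted as one of the permutations; that detail is omitted here for brevity.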
