On the logic of hypothesis testing in functional imaging

Statistics is now the customary language of functional imaging. It is common to express an experimental setting as a set of null hypotheses over complex models and to present results as maps of p-values derived from sophisticated probability distributions. However, the growing interest in developing advanced statistical algorithms is not always matched by similar attention to how these techniques may regiment the ways in which users draw inferences from their data. This article investigates the logical bases of current statistical approaches in functional imaging and probes their suitability for inductive inference in neuroscience. The frequentist approach to statistical inference is reviewed with attention to its two main constituents: Fisherian “significance testing” and Neyman-Pearson “hypothesis testing”. It is shown that these conceptual systems, which are similar in the univariate testing case, dissociate into two quite different methods of inference when applied to the multiple testing problem, the typical framework of functional imaging. This difference is explained with reference to specific issues, such as small volume correction, that are most likely to confuse the practitioner. Further insight into this problem is achieved by recasting the multiple comparison problem in a multivariate Bayesian formulation. This formulation introduces a new perspective in which the inferential process is more clearly divided into two distinct steps. The first step, inductive in form, uses exploratory techniques to form preliminary notions of the spatial patterns and of the signal and noise characteristics. In the second, deductive step, the (smaller) set of likely spatial patterns so generated is tested on newer data with a more rigorous multiple hypothesis testing technique.
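To make the multiple testing problem and the role of small volume correction concrete, the following is a minimal sketch rather than the article's own method. It assumes independent tests (real voxels are spatially correlated), uses a Bonferroni threshold as a stand-in for whatever correction a given analysis applies, and the voxel counts and function names are hypothetical values chosen only for illustration.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false positive when all n_tests nulls
    are true and the tests are independent (an illustrative assumption)."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value threshold that keeps the familywise error rate
    at or below alpha, whatever the dependence between tests."""
    return alpha / n_tests


alpha = 0.05
# Hypothetical search volumes: a whole-brain map versus a small region
# selected a priori (the situation addressed by small volume correction).
volumes = {"whole brain": 50_000, "small volume": 500}

for label, n_voxels in volumes.items():
    uncorrected = familywise_error_rate(alpha, n_voxels)
    threshold = bonferroni_threshold(alpha, n_voxels)
    corrected = familywise_error_rate(threshold, n_voxels)
    print(f"{label}: n={n_voxels}, "
          f"uncorrected FWER={uncorrected:.4f}, "
          f"per-voxel threshold={threshold:.2e}, "
          f"corrected FWER={corrected:.4f}")
```

Under these assumptions the per-voxel threshold depends only on the size of the search volume, so restricting the search to a smaller region relaxes the threshold a hundredfold in this example. What that restriction means for the resulting inference is the kind of question on which the Fisherian and Neyman-Pearson readings differ.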
