Controlling the rate of Type I error over a large set of statistical tests.

When many significance tests are examined in a single investigation using procedures that limit the probability of making at least one Type I error (the so-called familywise techniques of control), the likelihood of detecting effects can be very low. That is, under familywise error control, the critical value that must be exceeded to declare statistical significance can become extremely large when the number of tests is very large. We examined three methods for increasing sensitivity to detect effects when family size is large: the false discovery rate method of error control presented by Benjamini and Hochberg (1995); a modified false discovery rate method presented by Benjamini and Hochberg (2000), which estimates the number of true null hypotheses before applying false discovery rate control; and a familywise method modified to control the probability of committing two or more Type I errors in the family of tests, rather than one or more as with the usual familywise techniques. Our results indicated that the level of significance required by the two-or-more familywise method varied with the testing scenario and on occasion needed to be set at values in excess of 0.15 in order to control the two-or-more error rate at a reasonable value of 0.01. In addition, the false discovery rate methods typically resulted in substantially greater power to detect non-null effects, even though their levels of significance were set at the standard 0.05 value. Accordingly, we recommend the Benjamini and Hochberg (1995, 2000) methods of Type I error control when the number of tests in the family is large.
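To make the contrast between familywise and false discovery rate control concrete, the following is a minimal sketch of the Benjamini and Hochberg (1995) step-up rule next to a Bonferroni correction, which stands in here for a generic familywise method. The function names `benjamini_hochberg` and `bonferroni` and the p-values are illustrative assumptions, not the procedures or data of the study, and the sketch does not cover the adaptive (2000) variant or the two-or-more familywise method.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Step-up FDR rule: return True where the null hypothesis is rejected."""
    m = len(p_values)
    # Indices of the p-values in ascending order of p.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k whose ordered p-value satisfies p_(k) <= (k / m) * q.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


def bonferroni(p_values, alpha=0.05):
    """Familywise control: each test is compared with alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


# Hypothetical p-values, chosen only to illustrate the two rules.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042,
            0.060, 0.074, 0.205, 0.212, 0.216]

print("BH rejections:        ", sum(benjamini_hochberg(p_values)))
print("Bonferroni rejections:", sum(bonferroni(p_values)))
```

With these made-up values the step-up rule rejects two hypotheses and Bonferroni rejects one, because Bonferroni holds every test to the single threshold 0.05/10 whereas the step-up thresholds grow with the rank of the p-value.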

[1]  K. Gabriel,et al.  A Study of the Powers of Several Methods of Multiple Comparisons , 1975 .

[2]  R. Knoop Job Involvement: An Elusive Concept , 1986 .

[3]  D. Saville Multiple Comparison Procedures: The Practical Solution , 1990 .

[4]  Bernard Mazoyer,et al.  Functional connectivity in depressive, obsessive–compulsive, and schizophrenic disorders: an explorative correlational analysis of regional cerebral metabolism , 1998, Psychiatry Research: Neuroimaging.

[5]  Carl J. Huberty,et al.  Multiple Testing and Statistical Power With Modified Bonferroni Procedures , 1997 .

[6]  John W. Tukey,et al.  Controlling Error in Multiple Comparisons, with Examples from State-to-State Differences in Educational Achievement , 1999 .

[7]  Y. Benjamini,et al.  A step-down multiple hypotheses testing procedure that controls the false discovery rate under independence , 1999 .

[8]  Eric R. Ziegel,et al.  Multiple Comparisons and Multiple Tests , 2000 .

[9]  K. K. Lan,et al.  Some implications of an alternative definition of the multiple comparison problem , 1988 .

[10]  W. Wilson,et al.  A note on the inconsistency inherent in the necessity to perform multiple comparisons. , 1962, Psychological bulletin.

[11]  S. Sarkar Some probability inequalities for ordered $\mathrm{MTP}_2$ random variables: a proof of the Simes conjecture , 1998 .

[12]  K J Rothman,et al.  No Adjustments Are Needed for Multiple Comparisons , 1990, Epidemiology.

[13]  R. Elston,et al.  False discoveries in genome scanning , 1997, Genetic epidemiology.

[14]  Y. Benjamini,et al.  On the Adaptive Control of the False Discovery Rate in Multiple Testing With Independent Statistics , 2000 .

[15]  Robert A. Cribbie,et al.  The pairwise multiple comparison multiplicity problem: An alternative approach to familywise and comparison wise Type I error control. , 1999 .

[16]  Y. Benjamini,et al.  Controlling the false discovery rate: a practical and powerful approach to multiple testing , 1995 .

[17]  Y. Benjamini,et al.  THE CONTROL OF THE FALSE DISCOVERY RATE IN MULTIPLE TESTING UNDER DEPENDENCY , 2001 .

[18]  Y. Benjamini,et al.  Resampling-based false discovery rate controlling multiple test procedures for correlated test statistics , 1999 .

[19]  Jeffery S. Schippmann,et al.  Psychometric Evaluation of an Integrated Assessment Procedure , 1986 .

[20]  Carl J. Huberty,et al.  Statistical Practices of Educational Researchers: An Analysis of their ANOVA, MANOVA, and ANCOVA Analyses , 1998 .

[21]  Y. Hochberg A sharper Bonferroni procedure for multiple tests of significance , 1988 .

[22]  S. Sarkar,et al.  The Simes Method for Multiple Hypothesis Testing with Positively Dependent Test Statistics , 1997 .

[23]  D. Rom A sequentially rejective test procedure based on a modified Bonferroni inequality , 1990 .

[24]  Y. Benjamini,et al.  More powerful procedures for multiple significance testing. , 1990, Statistics in medicine.

[25]  Kern W. Dickman,et al.  Sample and population score matrices and sample correlation matrices from an arbitrary population correlation matrix , 1962 .

[26]  Siu Hung Cheung,et al.  Familywise robustness criteria for multiple‐comparison procedures , 2002 .