Escaping the Bonferroni iron claw in ecological studies

I analyze some criticisms of the application of alpha-inflation correction procedures to repeated-test tables in ecological studies. Common pitfalls during application, the statistical properties of many ecological datasets, and the strong control of the tablewise error rate exerted by the widely used sequential Bonferroni procedures seem to be responsible for some ‘illogical’ results when such corrections are applied. Sharpened Bonferroni-type procedures may alleviate the decrease in power associated with standard methods as the number of tests increases. More powerful methods, based on controlling the false discovery rate (FDR), deserve more frequent use in ecological studies, especially in those involving large repeated-test tables in which several or many individual null hypotheses have been rejected and the most significant p-value is relatively large. I conclude that some reasonable control of alpha inflation should be required of authors as a safeguard against striking but spurious findings, which may strongly affect the credibility of ecological research.

Moran (2003) recently suggested rejecting the application of the sequential Bonferroni rule in ecological studies. He based his proposal on certain mathematical, logical, and practical objections, which led him to conclude that ecological research would be better served by abandoning the awkward constraints of the sequential Bonferroni rule, leaving researchers freer to interpret multiple-test outcomes without testing for alpha inflation. In his view, detailed ecological research would thereby be stimulated, and potentially relevant results, which risk remaining unknown when authors are required to adhere strictly to the sequential Bonferroni rule, would not be lost. The likely increase in the frequency of ‘false positives’ in the ecological literature would be of minor importance, since these spurious results would not be confirmed by subsequent experiments.
In other contexts, even stronger claims against alpha corrections have recently been the subject of controversy (Perneger 1998, 1999, Feise 2002). Surprisingly few people have questioned the same corrections that are implicit in the standard post hoc methods routinely applied to perform multiple comparisons between treatments for a single dependent variable. If Moran’s arguments were accepted, it could equally be argued that relevant research results are perhaps not being published because people use these alpha-corrected methods instead of looking directly at the individual pairwise-test p-values. There is an apparent inconsistency between the unquestioned acceptance of the ‘alpha inflation under repeated test’ principle in the univariate case and the controversy over whether it is appropriate to apply the same statistical principle in the multivariate case. Arbitrary rejection of a well-founded statistical principle does not seem an acceptable scientific solution to a problem. If the way in which alpha-inflation corrections are routinely applied in multivariate ecological studies does not work, it seems more reasonable to analyze and improve the procedures rather than simply ‘kill the principle’.
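The contrast drawn above between the sequential Bonferroni rule and FDR-based control can be illustrated with a minimal sketch. It assumes that the sequential Bonferroni rule is Holm's (1979) step-down procedure and that FDR control is carried out with the step-up procedure of Benjamini and Hochberg (1995). The function names and the table of ten p-values are hypothetical and chosen only for illustration; they do not come from any real dataset.

```python
def holm_reject(pvalues, alpha=0.05):
    """Sequential Bonferroni (Holm step-down): controls the tablewise error rate."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices, smallest p first
    reject = [False] * m
    for step, idx in enumerate(order):
        # The k-th smallest p-value is compared with alpha / (m - k + 1).
        if pvalues[idx] <= alpha / (m - step):
            reject[idx] = True
        else:
            break  # stop at the first non-significant test
    return reject


def bh_reject(pvalues, q=0.05):
    """Benjamini-Hochberg step-up: controls the false discovery rate at level q."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank k whose p-value satisfies p_(k) <= k * q / m.
        if pvalues[idx] <= rank * q / m:
            cutoff = rank
    reject = [False] * m
    for idx in order[:cutoff]:
        reject[idx] = True
    return reject


# Hypothetical repeated-test table: several small p-values, none of them tiny.
pvals = [0.008, 0.009, 0.012, 0.018, 0.030, 0.040, 0.200, 0.450, 0.600, 0.900]

print("Holm rejections:", sum(holm_reject(pvals)))
print("BH rejections:  ", sum(bh_reject(pvals)))
```

With this illustrative table, the smallest p-value (0.008) exceeds the first Holm threshold (0.05/10 = 0.005), so the sequential Bonferroni rule rejects nothing, whereas the FDR procedure rejects the four smallest p-values. This is the situation described above: many individually significant tests, but a relatively large most-significant p-value.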

[1] M. Bartlett, et al. A note on the multiplying factors for various chi square approximations, 1954.

[2] P. Seeger. A Note on a Method for the Analysis of Significances en masse, 1968.

[3] S. Holm. A Simple Sequentially Rejective Multiple Test Procedure, 1979.

[4] E. Spjøtvoll, et al. Plots of P-values to evaluate many tests simultaneously, 1982.

[5] Y. Hochberg. A sharper Bonferroni procedure for multiple tests of significance, 1988.

[6] G. Hommel. A stagewise rejective multiple test procedure based on a modified Bonferroni test, 1988.

[7] A. Tamhane, et al. Multiple Comparison Procedures, 1989.

[8] C. J. Huberty, et al. Multivariate analysis versus multiple univariate analyses, 1989.

[9] Y. Benjamini, et al. More powerful procedures for multiple significance testing, 1990, Statistics in medicine.

[10] J. Booth, et al. Resampling-Based Multiple Testing, 1994.

[11] A. Tamhane, et al. Step-up multiple testing of parameters with unequally correlated estimates, 1995, Biometrics.

[12] Y. Benjamini, et al. Controlling the false discovery rate: a practical and powerful approach to multiple testing, 1995.

[13] M. F. Huque, et al. Some comments on frequently used multiple endpoint adjustment methods in clinical trials, 1997, Statistics in medicine.

[14] T. Perneger. What's wrong with Bonferroni adjustments, 1998, BMJ.

[15] M. Pagano, et al. Multiple comparisons: a cautionary tale about the dangers of fishing expeditions, 1999, Nutrition.

[16] Y. Benjamini, et al. Resampling-based false discovery rate controlling multiple test procedures for correlated test statistics, 1999.

[17] Russell D. Wolfinger, et al. Multiple Comparisons and Multiple Tests Using the SAS System, 1999.

[18] M. A. Waclawiw, et al. Practical guidelines for multiplicity adjustment in clinical trials, 2000, Controlled clinical trials.

[19] J. Troendle, et al. Stepwise normal theory multiple test procedures controlling the false discovery rate, 2000.

[20] Y. Benjamini, et al. THE CONTROL OF THE FALSE DISCOVERY RATE IN MULTIPLE TESTING UNDER DEPENDENCY, 2001.

[21] S. Lange, et al. Adjusting for multiple testing--when and how?, 2001, Journal of clinical epidemiology.

[22] Marti J. Anderson, et al. A new method for non-parametric multivariate analysis of variance in ecology, 2001.

[23] M. J., et al. CONTROLLING THE FALSE-DISCOVERY RATE IN ASTROPHYSICAL DATA ANALYSIS, 2001.

[24] J. Cheverud, et al. A simple correction for multiple comparisons in interval mapping genome scans, 2001, Heredity.

[25] Jonathan A. C. Sterne, et al. Sifting the evidence—what's wrong with significance tests?, 2001, BMJ : British Medical Journal.

[26] S. Sarkar. Some Results on False Discovery Rate in Stepwise multiple testing procedures, 2002.

[27] R. Redondo, et al. Seagull influence on soil properties, chenopod shrub distribution, and leaf nutrient status in semi-arid Mediterranean islands, 2002.

[28] John D. Storey. A direct approach to false discovery rates, 2002.

[29] R. Feise. Do multiple outcome measures require p-value adjustment?, 2002, BMC medical research methodology.

[30] David R. Bickel. Error-rate and decision-theoretic methods of multiple testing: Alternatives to controlling conventional false discovery rates, with an application to microarrays, 2002.

[31] Siu Hung Cheung, et al. Familywise robustness criteria for multiple‐comparison procedures, 2002.

[32] J. Espinar, et al. Submerged macrophyte zonation in a Mediterranean salt marsh: a facilitation effect from established helophytes?, 2002.

[33] R. Tibshirani, et al. Empirical bayes methods and false discovery rates for microarrays, 2002, Genetic epidemiology.

[34] L. Wasserman, et al. Operating characteristics and extensions of the false discovery rate procedure, 2002.

[35] Thomas E. Nichols, et al. Thresholding of Statistical Maps in Functional Neuroimaging Using the False Discovery Rate, 2002, NeuroImage.

[36] Yoav Benjamini, et al. Identifying differentially expressed genes using false discovery rate controlling procedures, 2003, Bioinform.

[37] M. Moran. Arguments for rejecting the sequential Bonferroni in ecological studies, 2003.

[38] Luis V. García, et al. Controlling the false discovery rate in ecological research, 2003.

[39] R. Fernando, et al. Controlling the Proportion of False Positives in Multiple Dependent Tests, 2004, Genetics.

[40] H. Keselman, et al. Multiple Comparison Procedures, 2005.