Ecologists should not use statistical significance tests to interpret simulation model results

Simulation models are widely used to represent the dynamics of ecological systems. A common question with such models is how changes to a parameter value or functional form in the model alter the results. Some authors have chosen to answer that question using frequentist statistical hypothesis tests (e.g. ANOVA). This is inappropriate for two reasons. First, p-values are determined by statistical power (i.e. replication), which can be arbitrarily high in a simulation context, producing minuscule p-values regardless of the effect size. Second, the null hypothesis of no difference between treatments (e.g. parameter values) is known a priori to be false, invalidating the premise of the test. Use of p-values is troublesome (rather than simply irrelevant) because small p-values lend a false sense of importance to observed differences. We argue that modelers should abandon this practice and focus on evaluating the magnitude of differences between simulations.

A growing number of authors in the ecological literature use statistical methods common to experimental ecology to analyze the output of ecological simulation models. For example, authors may use analysis of variance (ANOVA) to test whether model runs with different parameter values or different functional forms produce statistically different outputs. We view significance testing applied to simulation model output as a misuse of statistical theory. In this article we explain our reasoning with the goals of discouraging the practice, encouraging instead a focus on the magnitude of differences between simulations (i.e. effect sizes), and sparking discussion regarding when, if ever, statistical significance tests could be appropriate.

The perils of placing too much emphasis on statistical
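
The first objection in the abstract, that p-values track replication rather than effect size, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not taken from the article: the toy "model" (`simulate_output`, a parameter value plus Gaussian noise), the parameter values 100.0 and 100.1, the noise level, and the replicate counts. The p-value comes from a large-sample z-test, which is only a rough approximation at the smallest replicate count.

```python
import math
import random


def simulate_output(param, n_reps, rng, noise_sd=5.0):
    """Hypothetical stochastic model: each replicate returns param + noise."""
    return [rng.gauss(param, noise_sd) for _ in range(n_reps)]


def mean_and_var(xs):
    """Sample mean and unbiased sample variance."""
    m = math.fsum(xs) / len(xs)
    v = math.fsum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    return m, v


def compare(a, b):
    """Return (raw difference, Cohen's d, two-sided p from a z-test)."""
    ma, va = mean_and_var(a)
    mb, vb = mean_and_var(b)
    diff = mb - ma
    pooled_sd = math.sqrt((va + vb) / 2.0)      # valid for equal group sizes
    se = math.sqrt(va / len(a) + vb / len(b))   # standard error of the difference
    z = diff / se
    p = math.erfc(abs(z) / math.sqrt(2.0))      # two-sided normal tail area
    return diff, diff / pooled_sd, p


def replication_sweep(rep_counts=(10, 1_000, 400_000), seed=1):
    """Compare the same two parameter values at increasing replication."""
    rng = random.Random(seed)
    results = []
    for n in rep_counts:
        a = simulate_output(100.0, n, rng)   # assumed baseline parameter
        b = simulate_output(100.1, n, rng)   # assumed 0.1% change in parameter
        results.append((n, *compare(a, b)))
    return results


if __name__ == "__main__":
    for n, diff, d, p in replication_sweep():
        print(f"n={n:>7}  diff={diff:+.3f}  d={d:+.3f}  p={p:.3g}")
```

In this sketch the true standardized effect is fixed at 0.02 by construction, so the estimated effect size settles near that value as replication grows, while the p-value can be driven as low as the modeler likes simply by running more replicates. The quantity worth reporting is the difference itself (`diff` or `d`), judged against what is ecologically meaningful.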
