SSP: An R package to estimate sampling effort in studies of ecological communities

SSP (simulation-based sampling protocol) is an R package that uses simulated ecological data and the dissimilarity-based multivariate standard error (MultSE) as an estimator of precision to evaluate the adequacy of different sampling efforts for studies that will test hypotheses using permutational multivariate analysis of variance. The procedure consists of simulating several extensive data matrices that mimic some of the relevant ecological features of the community of interest, using a pilot data set. For each simulated data set, several sampling efforts are repeatedly executed and MultSE is calculated. The mean and the 0.025 and 0.975 quantiles of MultSE for each sampling effort across all simulated data sets are then estimated and standardized relative to the lowest sampling effort. The optimal sampling effort is identified as that beyond which an increase in sampling effort does not improve precision by more than a threshold value (e.g. 2.5 %). The performance of SSP was validated using real data; in all examples the simulated data mimicked the real data well, allowing the relationship between MultSE and n to be evaluated beyond the sample size of the pilot studies. SSP can be used to estimate sample size in a wide range of situations, from simple (e.g. single site) to more complex (e.g. several sites for different habitats) experimental designs. The latter is an important advantage, since it opens new possibilities for complex sampling designs, as has been advised for multi-scale studies in ecology.
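
The procedure described above can be illustrated with a minimal sketch. It is written in Python with the standard library only and is not the SSP package or its API. Several choices are assumptions made for illustration: the Bray-Curtis dissimilarity, the toy generator `simulate_units` (a stand-in for simulation from pilot data), the hypothetical `occupancy` parameters, and the stopping rule in `optimal_effort`, which compares the mean MultSE of consecutive efforts after scaling by the lowest effort. MultSE is computed as sqrt(V / n), with V = SS / (n - 1) and SS the sum of squared pairwise dissimilarities divided by n.

```python
import math
import random
import statistics


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    total = sum(x + y for x, y in zip(a, b))
    if total == 0:
        return 0.0  # convention chosen here: two empty units are identical
    return sum(abs(x - y) for x, y in zip(a, b)) / total


def mult_se(units):
    """MultSE = sqrt(V / n), V = SS / (n - 1), SS = (1 / n) * sum of d_ij^2."""
    n = len(units)
    ss = sum(bray_curtis(units[i], units[j]) ** 2
             for i in range(n) for j in range(i + 1, n)) / n
    return math.sqrt(ss / (n - 1) / n)


def simulate_units(n_units, occupancy, rng):
    """Toy generator (hypothetical, not the SSP simulation model):
    each species is present with its occupancy probability and, when
    present, gets a rounded lognormal abundance."""
    return [[round(rng.lognormvariate(1.0, 1.0)) if rng.random() < p else 0
             for p in occupancy]
            for _ in range(n_units)]


def multse_by_effort(datasets, efforts, reps, rng):
    """Collect MultSE per sampling effort across all simulated data sets."""
    values = {n: [] for n in efforts}
    for data in datasets:
        for n in efforts:
            for _ in range(reps):
                # draw n sampling units without replacement
                values[n].append(mult_se(rng.sample(data, n)))
    return values


def summarize(values):
    """Mean, 0.025 and 0.975 quantiles per effort, and the mean
    expressed relative to the lowest sampling effort."""
    base = statistics.mean(values[min(values)])
    out = {}
    for n, v in sorted(values.items()):
        cuts = statistics.quantiles(v, n=40, method="inclusive")
        mean = statistics.mean(v)
        out[n] = {"mean": mean, "lower": cuts[0], "upper": cuts[-1],
                  "relative": mean / base}
    return out


def optimal_effort(summary, threshold=0.025):
    """First effort at which adding one more unit lowers the relative
    MultSE by less than the threshold (an assumed stopping rule)."""
    efforts = sorted(summary)
    for prev, n in zip(efforts, efforts[1:]):
        if summary[prev]["relative"] - summary[n]["relative"] < threshold:
            return prev
    return efforts[-1]


rng = random.Random(1)
occupancy = [rng.uniform(0.2, 0.9) for _ in range(20)]  # hypothetical values
datasets = [simulate_units(50, occupancy, rng) for _ in range(5)]
summary = summarize(multse_by_effort(datasets, range(2, 16), 20, rng))
best_n = optimal_effort(summary)
```

In this sketch `summary[n]` holds the mean, lower and upper quantiles, and relative MultSE for sampling effort `n`, and `best_n` is the effort selected under the assumed 2.5 % rule. A real analysis would replace the toy generator with simulations parameterized from the pilot data and would use many more simulated data sets and repetitions.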
