Coverage-based rarefaction and extrapolation: standardizing samples by completeness rather than size.

We propose an integrated sampling, rarefaction, and extrapolation methodology for comparing the species richness of a set of communities based on samples of equal completeness (as measured by sample coverage) rather than equal size. Traditional rarefaction or extrapolation to equal-sized samples can misrepresent the relationships among the richnesses of the communities being compared, because a sample of a given size may be sufficient to fully characterize the lower-diversity community but insufficient to characterize the richer one. Thus, the traditional method systematically biases the magnitude of the differences in richness between communities. We derive a new analytic method for seamless coverage-based rarefaction and extrapolation. We show that this method yields less biased comparisons of richness between communities, and does so with less total sampling effort. When this approach is combined with an adaptive coverage-based stopping rule during sampling, samples can be compared directly without rarefaction, so no extra data are collected and none are discarded. Even if this stopping rule is not used during data collection, coverage-based rarefaction discards less data than traditional size-based rarefaction and more efficiently finds the correct ranking of communities according to their true richnesses. Several hypothetical and real examples demonstrate these advantages.
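
To make the idea of standardizing by completeness concrete, the following is a minimal Python sketch, not the derivation from the paper itself. It assumes individual-based abundance data (a list of counts per observed species), uses the classical Good-Turing coverage estimate 1 - f1/n (f1 = number of singletons, n = sample size), the standard expected richness of a random subsample of size m, and a commonly used estimator of the coverage of such a subsample. All function names are hypothetical, and extrapolation beyond the observed sample size is not covered.

```python
from math import comb


def good_turing_coverage(abundances):
    """Good-Turing estimate of sample coverage: 1 - f1/n."""
    n = sum(abundances)
    f1 = sum(1 for x in abundances if x == 1)  # number of singleton species
    return 1.0 - f1 / n


def rarefied_richness(abundances, m):
    """Expected number of species in a random subsample of m individuals."""
    n = sum(abundances)
    return sum(1.0 - comb(n - x, m) / comb(n, m) for x in abundances if x > 0)


def rarefied_coverage(abundances, m):
    """Estimated coverage of a random subsample of m individuals (m < n)."""
    n = sum(abundances)
    missed = sum(
        (x / n) * (comb(n - x, m) / comb(n - 1, m))
        for x in abundances
        if x > 0
    )
    return 1.0 - missed


def size_for_coverage(abundances, target):
    """Smallest subsample size whose estimated coverage reaches the target.

    Returns None when the target cannot be reached by rarefaction alone,
    in which case extrapolation would be needed.
    """
    n = sum(abundances)
    for m in range(1, n):
        if rarefied_coverage(abundances, m) >= target:
            return m
    return None


if __name__ == "__main__":
    # Hypothetical abundance vectors for a species-poor and a species-rich community.
    poor = [40, 30, 20, 10]
    rich = [20, 15, 10, 8, 6, 5, 4, 3, 2, 2, 1, 1, 1, 1, 1]
    target = 0.90
    for name, sample in (("poor", poor), ("rich", rich)):
        m = size_for_coverage(sample, target)
        if m is None:
            print(name, "target coverage not reached by rarefaction")
        else:
            print(name, "size:", m, "richness:", rarefied_richness(sample, m))
```

In this sketch the two communities are compared at the subsample sizes that give each the same estimated coverage, rather than at one common size, so the richer community is typically allowed a larger subsample.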
