Unveiling the species-rank abundance distribution by generalizing the Good-Turing sample coverage theory.

Based on a sample of individuals, we focus on inferring the vector of species relative abundances of an entire assemblage and propose a novel estimator of the complete species-rank abundance distribution (RAD). Nearly all previous estimators of the RAD use the conventional "plug-in" estimator p̂_i (sample relative abundance) of the true relative abundance p_i of species i. Because most biodiversity samples are incomplete, the plug-in estimators are applied only to the subset of species that are detected in the sample. Using the concept of sample coverage and its generalization, we propose a new statistical framework to estimate the complete RAD by separately adjusting the sample relative abundances for the set of species detected in the sample and estimating the relative abundances for the set of species undetected in the sample but inferred to be present in the assemblage. We first show that p̂_i is a positively biased estimator of p_i for species detected in the sample, and that the degree of bias increases with increasing relative rarity of each species. We next derive a method to adjust the sample relative abundance to reduce the positive bias inherent in p̂_i. The adjustment method provides a nonparametric resolution to the longstanding challenge of characterizing the relationship between the true relative abundance in the entire assemblage and the observed relative abundance in a sample. Finally, we propose a method to estimate the true relative abundances of the undetected species based on a lower bound of the number of undetected species. We then combine the adjusted RAD for the detected species and the estimated RAD for the undetected species to obtain the complete RAD estimator. Simulation results show that the proposed RAD curve can unveil the true RAD and is more accurate than the empirical RAD. We also extend our method to incidence data. Our formulas and estimators are illustrated using empirical data sets from surveys of forest spiders (for abundance data) and soil ciliates (for incidence data). The proposed RAD estimator is also applicable to estimating various diversity measures and should be widely useful in analyses of biodiversity and community structure.
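
The ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the estimator proposed in the paper. It only shows (a) why the plug-in estimator is positively biased once we condition on a species being detected, assuming each species count is Binomial(n, p_i), and (b) two classical quantities the framework builds on: the Good-Turing sample coverage estimate 1 - f1/n and the Chao1 lower bound on the number of undetected species, where f1 and f2 are the numbers of species observed exactly once and twice. The function names and the example counts are hypothetical, and the paper's own formulas may include further corrections.

```python
from collections import Counter
from typing import Sequence


def plugin_mean_given_detected(p: float, n: int) -> float:
    """Exact E[X/n | X > 0] for X ~ Binomial(n, p) (sampling model is an assumption).

    Since E[X] = n*p and P(X > 0) = 1 - (1 - p)**n, the conditional mean of
    the sample relative abundance is p / (1 - (1 - p)**n), which exceeds p.
    """
    return p / (1.0 - (1.0 - p) ** n)


def turing_coverage(counts: Sequence[int]) -> float:
    """Classical Good-Turing sample coverage estimate: 1 - f1/n."""
    n = sum(counts)
    f1 = sum(1 for c in counts if c == 1)
    return 1.0 - f1 / n


def chao1_undetected(counts: Sequence[int]) -> float:
    """Classical Chao1 lower bound on the number of undetected species."""
    freq = Counter(counts)
    f1, f2 = freq[1], freq[2]
    if f2 > 0:
        return f1 * f1 / (2.0 * f2)
    # Bias-corrected form used when no species is observed exactly twice.
    return f1 * (f1 - 1) / 2.0


if __name__ == "__main__":
    # Hypothetical abundance counts for the detected species only.
    counts = [10, 5, 2, 2, 1, 1, 1]
    n = sum(counts)
    print("sample coverage estimate:", turing_coverage(counts))
    print("Chao1 undetected species:", chao1_undetected(counts))
    # Relative bias of the plug-in estimator grows as species become rarer.
    for p in (0.2, 0.05, 0.01):
        print(p, plugin_mean_given_detected(p, n) / p)
```

The ratio printed in the final loop is 1 / (1 - (1 - p)^n), which approaches 1 for common species and grows as p shrinks. This matches the abstract's statement that the bias increases with relative rarity.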
