Estimating the Number of Classes in a Finite Population

Abstract: We use an extension of the generalized jackknife approach of Gray and Schucany to obtain new nonparametric estimators for the number of classes in a finite population of known size. We also show that generalized jackknife estimators are closely related to certain Horvitz–Thompson estimators, to an estimator of Shlosser, and to estimators based on sample coverage. In particular, the generalized jackknife approach leads to a modification of Shlosser's estimator that does not suffer from the erratic behavior of the original estimator. The performance of both new and previous estimators is investigated by means of an asymptotic variance analysis and a Monte Carlo simulation study.
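As a rough illustration of the sample statistics that estimators of this kind are built from, the sketch below computes the sample size n, the number of distinct classes d seen in the sample, and the count f1 of classes seen exactly once. It then evaluates two simple estimates. Neither formula is stated in the abstract, and all function names are hypothetical: the coverage-based form d / (1 - f1/n) is the familiar Good-Turing style coverage correction, and the first-order jackknife-type form d / (1 - (1 - q) f1/n), with sampling fraction q = n/N, is an assumed form used here only to show how a known population size N can enter the estimate.

```python
from collections import Counter


def frequency_counts(sample):
    """Return (n, d, f), where f[j] is the number of classes seen exactly j times."""
    if not sample:
        raise ValueError("sample must be non-empty")
    class_sizes = Counter(sample)            # class label -> occurrences in sample
    f = Counter(class_sizes.values())        # occurrences -> number of such classes
    return len(sample), len(class_sizes), f


def coverage_estimate(sample):
    """Coverage-based estimate d / (1 - f1/n); an illustrative assumption."""
    n, d, f = frequency_counts(sample)
    coverage = 1.0 - f[1] / n                # estimated sample coverage
    if coverage <= 0.0:
        return float("inf")                  # every sampled class is a singleton
    return d / coverage


def first_order_jackknife_estimate(sample, population_size):
    """Assumed first-order jackknife-type form d / (1 - (1 - q) * f1 / n)."""
    n, d, f = frequency_counts(sample)
    if population_size < n:
        raise ValueError("population_size must be at least the sample size")
    q = n / population_size                  # sampling fraction
    return d / (1.0 - (1.0 - q) * f[1] / n)


if __name__ == "__main__":
    sample = ["a", "a", "b", "c", "c", "c", "d", "e"]
    print(coverage_estimate(sample))
    print(first_order_jackknife_estimate(sample, population_size=80))
```

When the sample is the whole population (q = 1), the second function returns d exactly, which is the behavior one would want from any estimator that uses the known population size.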
