Minimax risk for missing mass estimation

The problem of estimating the missing mass, i.e., the total probability of elements that do not appear in a sequence of n random samples, is considered under squared error loss. The worst-case risk of the popular Good-Turing estimator is shown to lie between 0.6080/n and 0.6179/n, and the minimax risk is shown to be lower bounded by 0.25/n. This appears to be the first published result on the minimax risk of missing mass estimation, a problem with several practical and theoretical applications.
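
For concreteness, the following is a minimal sketch of the quantities involved. It assumes the standard form of the Good-Turing missing mass estimator (the number of symbols seen exactly once, divided by n). The function names and the toy distribution are illustrative and do not come from the paper.

```python
from collections import Counter


def good_turing_missing_mass(samples):
    """Good-Turing estimate: (number of symbols seen exactly once) / n."""
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    counts = Counter(samples)
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / n


def true_missing_mass(samples, probs):
    """Total probability of the symbols that do not appear in `samples`.

    `probs` maps each symbol to its probability (hypothetical input format).
    """
    seen = set(samples)
    return sum(p for symbol, p in probs.items() if symbol not in seen)


def squared_error(samples, probs):
    """Squared error loss of the Good-Turing estimate on one sample sequence."""
    diff = good_turing_missing_mass(samples) - true_missing_mass(samples, probs)
    return diff ** 2
```

The risk discussed in the abstract is the expectation of this squared error over the random samples. The worst-case risk is its supremum over all distributions, which this sketch does not attempt to compute.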
