Towards Practical Mean Bounds for Small Samples

Historically, to bound the mean for small sample sizes, practitioners have had to choose between using methods with unrealistic assumptions about the unknown distribution (e.g., Gaussianity) and methods like Hoeffding’s inequality that use weaker assumptions but produce much looser (wider) intervals. In 1969, Anderson (1969a) proposed a mean confidence interval strictly better than or equal to Hoeffding’s whose only assumption is that the distribution’s support is contained in an interval [a, b]. For the first time since then, we present a new family of bounds that compares favorably to Anderson’s. We prove that each bound in the family has guaranteed coverage, i.e., it holds with probability at least 1 − α for all distributions on an interval [a, b]. Furthermore, one of the bounds is tighter than or equal to Anderson’s for all samples. In simulations, we show that for many distributions, the gain over Anderson’s bound is substantial.
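To make the comparison between the two classical bounds concrete, the following minimal sketch computes a one-sided upper confidence bound on the mean in both ways. It rests on stated assumptions rather than on the authors' code: the function names and the example sample are hypothetical, and the Anderson-style bound uses the Dvoretzky-Kiefer-Wolfowitz (DKW) band width sqrt(ln(1/α)/(2n)) in place of the exact one-sided Kolmogorov-Smirnov constant, which makes it slightly looser than Anderson's original construction.

```python
import math


def hoeffding_upper(sample, a, b, alpha):
    """One-sided Hoeffding upper bound on the mean of a distribution on [a, b].

    Holds with probability at least 1 - alpha.
    """
    n = len(sample)
    eps = math.sqrt(math.log(1.0 / alpha) / (2.0 * n))
    return sum(sample) / n + (b - a) * eps


def anderson_dkw_upper(sample, b, alpha):
    """Anderson-style upper bound on the mean, using a DKW band (assumption).

    The empirical CDF is shifted down by eps and clipped at 0, giving a
    lower confidence band on the true CDF. Since mean = b - integral of the
    CDF over [a, b], a lower band on the CDF yields an upper bound on the mean.
    The one-sided DKW inequality with this eps requires alpha <= 0.5.
    """
    n = len(sample)
    eps = math.sqrt(math.log(1.0 / alpha) / (2.0 * n))
    z = sorted(sample) + [b]  # order statistics, with b appended as z_{n+1}
    # On [z_i, z_{i+1}) the empirical CDF equals i/n (1-based i).
    area = sum(
        max(0.0, (i + 1) / n - eps) * (z[i + 1] - z[i]) for i in range(n)
    )
    return b - area


# Hypothetical small sample on [0, 1].
sample = [0.1, 0.2, 0.2, 0.3, 0.5, 0.4, 0.1, 0.6, 0.3, 0.2]
h = hoeffding_upper(sample, a=0.0, b=1.0, alpha=0.05)
d = anderson_dkw_upper(sample, b=1.0, alpha=0.05)
print(f"Hoeffding upper bound:    {h:.4f}")
print(f"Anderson/DKW upper bound: {d:.4f}")
```

With this choice of band width, the Anderson-style bound never exceeds the Hoeffding bound: dropping the clipping at 0 gives the sample mean plus eps times (b minus the smallest observation), which is at most the Hoeffding value.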

[1] Mame Astou Diouf et al. Improved Nonparametric Inference for the Mean of a Bounded Random Variable with Application to Poverty Measures, 2005.

[2] John E. Angus et al. The Probability Integral Transform and Related Results, 1994, SIAM Rev.

[3] Philip S. Thomas et al. High-Confidence Off-Policy Evaluation, 2015, AAAI.

[4] Massimiliano Pontil et al. Empirical Bernstein Bounds and Sample-Variance Penalization, 2009, COLT.

[5] M.C.A. van Zuijlen et al. The Stringer Bound in Case of Uniform Taintings, 1995.

[6] Student et al. The Probable Error of a Mean, 1908.

[7] W. Hoeffding. Probability Inequalities for Sums of Bounded Random Variables, 1963.

[8] M. Kenward et al. An Introduction to the Bootstrap, 2007.

[9] Norbert Gaffke. Three Test Statistics for a Nonparametric One-Sided Hypothesis on the Mean of a Nonnegative Variable, 2004.

[10] T. W. Anderson. Confidence Limits for the Expected Value of an Arbitrary Bounded Random Variable with a Continuous Distribution Function, 1969.

[11] Erik G. Learned-Miller et al. A Probabilistic Upper Bound on Differential Entropy, 2005, IEEE Transactions on Information Theory.

[12] C. H. Evans et al. Small Clinical Trials: Issues and Challenges, 2001.

[13] Aaditya Ramdas et al. Estimating Means of Bounded Random Variables by Betting, 2020.

[14] S. Fienberg et al. Estimating the Total Overstatement Error in Accounting Populations, 1977.

[15] G. Bennett. Probability Inequalities for the Sum of Independent Random Variables, 1962.

[16] R. J. Serfling. Approximation Theorems of Mathematical Statistics, 1980.

[17] Joseph P. Romano. Finite Sample Nonparametric Inference and Large Sample Efficiency, 1998.

[18] E. S. Pearson et al. The Use of Confidence or Fiducial Limits Illustrated in the Case of the Binomial, 1934.

[19] J. Kiefer et al. Asymptotic Minimax Character of the Sample Distribution Function and of the Classical Multinomial Estimator, 1956.

[20] Philip S. Thomas et al. A New Confidence Interval for the Mean of a Bounded Random Variable, 2019, ArXiv.