The HulC: Confidence Regions from Convex Hulls

We develop and analyze the HulC, an intuitive and general method for constructing confidence sets using the convex hull of estimates computed from subsets of the data. Classical methods rely on estimating the (limiting) distribution of an estimator; the HulC bypasses this step and is often simpler to use. In comparison to the bootstrap, the HulC requires fewer regularity conditions and succeeds in many examples where the bootstrap provably fails. Unlike subsampling, the HulC does not require knowledge of the rate of convergence of the estimators on which it is based. The validity of the HulC does, however, require knowledge of the (asymptotic) median-bias of the estimators. We further analyze a variant of our basic method, called the Adaptive HulC, which is fully data-driven and estimates the median-bias using subsampling. We show that the Adaptive HulC retains the aforementioned strengths of the HulC. In certain cases where the underlying estimators are pathologically asymmetric, the HulC and the Adaptive HulC can fail to provide useful confidence sets. We propose a final variant, the Unimodal HulC, which can salvage the situation when the distribution of the underlying estimator is (asymptotically) unimodal. We discuss these methods in the context of several challenging inferential problems that arise in parametric, semi-parametric, and non-parametric inference. Although our focus is on validity under weak regularity conditions, we also provide some general results on the width of the HulC confidence sets, showing that in many cases they have near-optimal width.
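To make the construction concrete, the following is a minimal sketch of the basic idea for a scalar parameter, not the authors' implementation. It assumes the median-bias `delta` of the estimator is known and lies in [0, 1/2), and it picks the smallest number of splits `B` for which the bound (1/2 - delta)^B + (1/2 + delta)^B on the probability that all `B` estimates fall on the same side of the target is at most `alpha`. The names `hulc_num_splits` and `hulc_interval` are hypothetical, and any randomization over `B` that would make the coverage exact rather than conservative is left out.

```python
import random
import statistics


def hulc_num_splits(alpha, delta=0.0):
    """Smallest B with (1/2 - delta)^B + (1/2 + delta)^B <= alpha.

    Assumption: delta is the known median-bias, with 0 <= delta < 1/2.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    if not 0.0 <= delta < 0.5:
        raise ValueError("delta must lie in [0, 0.5)")
    B = 1
    while (0.5 - delta) ** B + (0.5 + delta) ** B > alpha:
        B += 1
    return B


def hulc_interval(data, estimator, alpha=0.05, delta=0.0, seed=0):
    """Convex hull (min, max) of estimates from B disjoint random splits."""
    B = hulc_num_splits(alpha, delta)
    if len(data) < B:
        raise ValueError("need at least one observation per split")
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    # Split i takes every B-th shuffled observation, starting at index i.
    estimates = [estimator(shuffled[i::B]) for i in range(B)]
    return min(estimates), max(estimates)


if __name__ == "__main__":
    rng = random.Random(1)
    sample = [rng.gauss(0.0, 1.0) for _ in range(600)]
    # The sample median is used here as the estimator, with delta = 0 assumed.
    print(hulc_interval(sample, statistics.median, alpha=0.05))
```

Note that neither a variance estimate nor a rate of convergence appears anywhere in the sketch; only the estimator and its median-bias are needed, which is the feature the abstract emphasizes.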
