Divide and conquer in nonstandard problems and the super-efficiency phenomenon

We study how the divide-and-conquer principle (partition the available data into subsamples, compute an estimate from each subsample, and combine these appropriately to form the final estimator) works in nonstandard problems, where rates of convergence are typically slower than $\sqrt{n}$ and limit distributions are non-Gaussian, with a special emphasis on the least squares estimator (and its inverse) of a monotone regression function. We find that the pooled estimator, obtained by averaging the nonstandard estimates across the mutually exclusive subsamples, outperforms the nonstandard estimator based on the entire sample in the sense of pointwise inference. We also show that, under appropriate conditions, if the number of subsamples is allowed to increase at an appropriate rate, the pooled estimator is asymptotically normally distributed with a variance that is empirically estimable from the subsample-level estimates. Further, in the context of monotone function estimation, we show that this gain in pointwise efficiency comes at a price: the pooled estimator's performance, in a uniform sense (maximal risk) over a class of models, worsens as the number of subsamples increases, leading to a version of the super-efficiency phenomenon. In the process, we develop analytical results for the order of the bias in isotonic regression, which are of independent interest.
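As a minimal illustration of the pooling scheme described above (a sketch in our own notation, not the paper's implementation), the code below splits the data into $m$ disjoint subsamples at random, computes the monotone least squares estimator on each via the pool-adjacent-violators algorithm, evaluates each fit at a fixed point $x_0$ (here, at the largest design point not exceeding $x_0$), and averages the subsample estimates. All function names are ours, and the choices of splitting scheme and step-function evaluation convention are assumptions made for the sketch.

```python
import numpy as np

def pava(y):
    """Pool-adjacent-violators: least-squares nondecreasing fit to y (equal weights)."""
    values, weights = [], []
    for v in y:
        values.append(float(v))
        weights.append(1.0)
        # merge adjacent blocks while the nondecreasing constraint is violated
        while len(values) > 1 and values[-2] > values[-1]:
            w = weights[-2] + weights[-1]
            merged = (values[-2] * weights[-2] + values[-1] * weights[-1]) / w
            values[-2:], weights[-2:] = [merged], [w]
    return np.repeat(values, np.asarray(weights, dtype=int))

def iso_lse_at(x, y, x0):
    """Monotone LSE of the regression function, evaluated at the largest design point <= x0."""
    order = np.argsort(x)
    fit = pava(y[order])
    idx = np.clip(np.searchsorted(x[order], x0, side="right") - 1, 0, len(x) - 1)
    return fit[idx]

def pooled_estimate(x, y, x0, m, seed=None):
    """Divide-and-conquer: split the data into m disjoint subsamples at random,
    compute the isotonic LSE at x0 on each, and average the subsample estimates.
    Also returns a variance estimate built from the m subsample-level estimates."""
    rng = np.random.default_rng(seed)
    parts = np.array_split(rng.permutation(len(x)), m)
    ests = np.array([iso_lse_at(x[p], y[p], x0) for p in parts])
    return ests.mean(), ests.var(ddof=1) / m

# Toy example: monotone signal x**2 with Gaussian noise.
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 10_000)
y = x**2 + rng.normal(scale=0.3, size=x.size)
est, var_hat = pooled_estimate(x, y, x0=0.5, m=20, seed=1)
```

In the spirit of the result stated above, `var_hat` illustrates how the variance of the pooled estimator can be estimated empirically from the subsample-level estimates; the sketch makes no claim about the rate at which $m$ may grow, which is governed by the conditions in the paper.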
