Second-order asymptotic theory for calibration estimators in sampling and missing-data problems

Consider three different but related problems with auxiliary information: infinite-population sampling or Monte Carlo with control variates, missing response with explanatory variables, and Poisson and rejective sampling with auxiliary variables. We present unified regression and likelihood estimators for these problems and study their second-order properties. The likelihood estimators are second-order unbiased, but the regression estimators are not. For the missing-data problem and survey sampling, none of the estimators studied always has the smallest second-order variance, even after bias correction. However, the calibrated likelihood estimator and the bias-corrected, calibrated regression estimator are second-order more efficient than the other bias-corrected estimators if a linear model holds for the conditional expectation of the response or study variable given the explanatory or auxiliary variables.
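To make the first of the three settings concrete, the following is a minimal sketch of the classical regression (control-variate) estimator for Monte Carlo with one auxiliary variable whose mean is known. It is an illustration under stated assumptions, not the unified estimators studied in the paper: the function name `regression_estimator`, the toy integrand exp(U), and the sample size are all hypothetical choices made here. Because the slope is estimated from the same sample, an estimator of this form generally carries a bias of order 1/n, which is the kind of second-order effect the abstract refers to.

```python
import math
import random


def regression_estimator(y, x, x_mean):
    """Regression (control-variate) estimate of E[Y] when E[X] = x_mean is known.

    Hypothetical helper for illustration only; it implements the textbook
    single-covariate form  y_bar - beta_hat * (x_bar - x_mean).
    """
    n = len(y)
    y_bar = sum(y) / n
    x_bar = sum(x) / n
    # Least-squares slope of y on x, estimated from the same sample.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    beta_hat = s_xy / s_xx
    # Shift the sample mean of y by the observed error in the sample mean of x.
    return y_bar - beta_hat * (x_bar - x_mean)


# Toy problem (an assumption of this sketch): estimate E[exp(U)] = e - 1
# for U ~ Uniform(0, 1), using U itself as the control variate (E[U] = 0.5).
rng = random.Random(0)
x = [rng.random() for _ in range(2000)]
y = [math.exp(u) for u in x]

plain = sum(y) / len(y)                     # unadjusted sample mean
adjusted = regression_estimator(y, x, 0.5)  # control-variate adjusted estimate
```

In this toy problem exp(U) is nearly linear in U, so `adjusted` should typically lie much closer to e - 1 than `plain` does. When the response is exactly linear in the auxiliary variable, the estimator reproduces the true mean exactly, which mirrors the role of the linear-model condition in the efficiency comparison above.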
