Connections between Survey Calibration Estimators and Semiparametric Models for Incomplete Data

Survey calibration (or generalized raking) estimators are a standard approach to the use of auxiliary information in survey sampling, improving on the simple Horvitz-Thompson estimator. In this paper we relate the survey calibration estimators to the semiparametric incomplete-data estimators of Robins and coworkers, and to adjustment for baseline variables in a randomized trial. The development based on calibration estimators explains the 'estimated weights' paradox and provides useful heuristics for constructing practical estimators. We present some examples of using calibration to gain precision without making additional modelling assumptions in a variety of regression models.
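
As a rough sketch of the setup, in notation assumed here rather than taken from the paper: let $s$ denote the sample, $\pi_i$ the inclusion probability of unit $i$, $y_i$ the variable of interest, and $x_i$ a vector of auxiliary variables whose population total $T_x = \sum_{i=1}^{N} x_i$ is known. The chi-square distance used below is only one common choice; generalized raking allows other distance functions.

```latex
% Horvitz-Thompson estimator of the population total of y
\hat{T}_{HT} = \sum_{i \in s} \frac{y_i}{\pi_i}

% Calibration: choose weights w_i close to the design weights 1/\pi_i
% while reproducing the known auxiliary total exactly
\hat{T}_{cal} = \sum_{i \in s} w_i y_i,
\qquad
\{w_i\} = \arg\min_{w} \sum_{i \in s} d\!\left(w_i, \tfrac{1}{\pi_i}\right)
\quad \text{subject to} \quad
\sum_{i \in s} w_i x_i = T_x

% Illustrative special case: d(w, a) = (w - a)^2 / (2a) gives linear calibration
w_i = \frac{1}{\pi_i}\left(1 + x_i^{\top} \lambda\right),
\qquad
\lambda = \Bigl(\sum_{i \in s} \frac{x_i x_i^{\top}}{\pi_i}\Bigr)^{-1}
          \Bigl(T_x - \sum_{i \in s} \frac{x_i}{\pi_i}\Bigr)
```

When $x_i$ is strongly related to $y_i$, the calibrated weights typically give a more precise estimate than the Horvitz-Thompson weights, which is the sense in which calibration uses auxiliary information.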
