Semiparametric efficiency in GMM models with auxiliary data

We study semiparametric efficiency bounds and efficient estimation of parameters defined through general moment restrictions with missing data. Identification relies on auxiliary data that contain information about the distribution of the missing variables conditional on proxy variables observed in both the primary and the auxiliary data sets, under the assumption that this conditional distribution is common to the two data sets. The auxiliary sample can be independent of the primary sample or a subset of it. For both cases, we derive bounds when the probability of missing data given the proxy variables is unknown, known, or belongs to a correctly specified parametric family. We find that the conditional probability is not ancillary when the two samples are independent. For all cases, we discuss efficient semiparametric estimators. An estimator based on a conditional expectation projection is shown to require milder regularity conditions than one based on inverse probability weighting.
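
As a rough illustration of the two estimation strategies named above, the sketch below contrasts a conditional expectation projection with inverse probability weighting in a deliberately simple setting. It is not the paper's estimator. Everything in it is an assumption made for illustration: the function names, the simulated data, a discrete proxy with three cells, an auxiliary sample independent of the primary one, and a target parameter equal to the primary-population mean of a variable Y that is observed only in the auxiliary sample. With a discrete proxy and cell-by-cell (saturated) estimates, the two estimators coincide numerically; in general they differ.

```python
import numpy as np


def cep_estimate(x_primary, x_aux, y_aux):
    """Projection-style estimate: estimate E[Y | X] by cell means in the
    auxiliary sample, then average the fitted values over the primary X."""
    cell_means = {c: y_aux[x_aux == c].mean() for c in np.unique(x_aux)}
    return float(np.mean([cell_means[c] for c in x_primary]))


def ipw_estimate(x_primary, x_aux, y_aux):
    """Weighting-style estimate: reweight auxiliary observations by the
    estimated odds p(x) / (1 - p(x)) of belonging to the primary sample."""
    # With cell frequencies, the estimated odds in cell c equal n1_c / n0_c.
    odds = {c: np.sum(x_primary == c) / np.sum(x_aux == c)
            for c in np.unique(x_aux)}
    weights = np.array([odds[c] for c in x_aux])
    return float(np.sum(weights * y_aux) / len(x_primary))


def simulate(n_primary=4000, n_aux=4000, seed=0):
    """Hypothetical data: X is distributed differently across the two samples,
    while the distribution of Y given X is common to both."""
    rng = np.random.default_rng(seed)
    x_primary = rng.choice(3, size=n_primary, p=[0.2, 0.3, 0.5])
    x_aux = rng.choice(3, size=n_aux, p=[0.5, 0.3, 0.2])
    y_aux = 1.0 + 2.0 * x_aux + rng.normal(size=n_aux)  # Y seen only here
    return x_primary, x_aux, y_aux


if __name__ == "__main__":
    xp, xa, ya = simulate()
    print("naive auxiliary mean:", float(ya.mean()))
    print("projection estimate: ", cep_estimate(xp, xa, ya))
    print("weighting estimate:  ", ipw_estimate(xp, xa, ya))
```

In this simulated design the primary-population mean of Y is 1 + 2 E[X] = 3.6, whereas the unadjusted auxiliary mean targets 2.4, so both adjusted estimates should land near 3.6 and the naive mean should not.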
