Model-Assisted Estimation for Complex Surveys Using Penalized Splines

Estimation of finite population totals in the presence of auxiliary information is considered. A class of estimators based on penalised spline regression is proposed. These estimators are weighted linear combinations of sample observations, with weights calibrated to known control totals. They allow straightforward extensions to multiple auxiliary variables and to complex designs. Under standard design conditions, the estimators are design consistent and asymptotically normal, and they admit consistent variance estimation using familiar design-based methods. Data-driven penalty selection is considered in the context of unequal probability sampling designs. Simulation experiments show that the estimators are more efficient than parametric regression estimators when the parametric model is incorrectly specified, while being approximately as efficient when the parametric specification is correct. An example using Forest Health Monitoring survey data from the U.S. Forest Service demonstrates the applicability of the methodology in the context of a two-phase survey with multiple auxiliary variables. Copyright 2005, Oxford University Press.
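The statement that the estimators are weighted linear combinations of sample observations, with weights calibrated to known control totals, can be illustrated with a small sketch. It is not the authors' implementation. It assumes a single auxiliary variable, a truncated linear spline basis, a ridge penalty on the knot coefficients only, a fixed penalty `lam`, and Poisson sampling with known inclusion probabilities. All function and variable names (`basis`, `solve`, `pspline_weights`, `lam`, `knots`) are hypothetical.

```python
import math
import random

def basis(x, knots):
    """Truncated linear spline basis: [1, x, (x-k1)+, ..., (x-kK)+]."""
    return [1.0, x] + [max(x - k, 0.0) for k in knots]

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (M[r][n] - sum(M[r][k] * z[k] for k in range(r + 1, n))) / M[r][r]
    return z

def pspline_weights(x_pop, sample_idx, pi, knots, lam):
    """Return {i: w_i} so that the estimated total is sum of w_i * y_i over the sample."""
    q = len(knots) + 2
    B = {i: basis(x_pop[i], knots) for i in sample_idx}
    # A = B' W B + lam * D, with W = diag(1/pi); D penalises only the knot coefficients.
    A = [[sum(B[i][j] * B[i][k] / pi[i] for i in sample_idx) for k in range(q)]
         for j in range(q)]
    for j in range(2, q):
        A[j][j] += lam
    # Gap between known population basis totals and their Horvitz-Thompson estimates.
    t_pop = [sum(basis(x, knots)[j] for x in x_pop) for j in range(q)]
    t_ht = [sum(B[i][j] / pi[i] for i in sample_idx) for j in range(q)]
    g = solve(A, [t_pop[j] - t_ht[j] for j in range(q)])  # A is symmetric
    return {i: (1.0 + sum(g[j] * B[i][j] for j in range(q))) / pi[i] for i in sample_idx}

# Synthetic population (assumed for illustration only).
random.seed(1)
N = 500
x_pop = [random.random() for _ in range(N)]
y_pop = [math.sin(2 * math.pi * x) + 2.0 + random.gauss(0.0, 0.1) for x in x_pop]

# Poisson sampling with unequal, known inclusion probabilities.
pi = [0.05 + 0.10 * x for x in x_pop]
sample_idx = [i for i in range(N) if random.random() < pi[i]]

knots = [k / 6.0 for k in range(1, 6)]
w = pspline_weights(x_pop, sample_idx, pi, knots, lam=1.0)

t_hat = sum(w[i] * y_pop[i] for i in sample_idx)
print("estimated total:", t_hat, "true total:", sum(y_pop))
```

In this sketch the weights do not depend on the study variable, so the same set can be applied to any survey variable. Because the intercept and linear terms are left unpenalised, the weights reproduce the population size and the population total of `x` exactly (up to rounding); the penalised knot terms are matched only approximately.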
