A comparison between the design-based and model-based approaches using longitudinal survey data

Analyses of survey data collected under complex sampling designs should account for clustering, stratification and unequal probabilities of selection. Design-based and model-based methods are the two commonly used ways of accounting for such designs. Several studies of cross-sectional survey designs have shown that the two approaches give similar results when the model fits the data well. The present paper compares the two approaches for a longitudinal survey design, using the National Population Health Survey (NPHS) dataset. The design-based methods used were a marginal modeling approach proposed by Rao and a modified bootstrap method for longitudinal data. The Generalized Estimating Equation (GEE) method proposed by Liang and Zeger was used as a typical model-based approach. The parameter estimates obtained with the design-based and model-based methods were similar. However, the standard errors and the 95% confidence intervals differed. Rao's method produced the most conservative standard errors. In conclusion, design-based methods should be preferred over model-based methods, because they provide reliable results.
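The contrast between ignoring and using the sampling weights can be illustrated with a minimal sketch. It is not the paper's analysis: it does not use NPHS data, Rao's marginal modeling method, or the modified bootstrap. It assumes an intercept-only linear model with an independence working correlation, fitted to synthetic repeated measurements, and uses a sandwich (linearization) variance computed from person-level score totals. The function name `clustered_mean`, the weights, and the data-generating values are all hypothetical choices made for illustration.

```python
import math
import random


def clustered_mean(values, clusters, weights=None):
    """Solve sum_i w_i * (y_i - mu) = 0 and return (mu, sandwich standard error).

    Observations sharing a cluster id (here: repeated measurements on one
    person) are allowed to be correlated; weights default to 1 (unweighted).
    """
    if weights is None:
        weights = [1.0] * len(values)
    total_w = sum(weights)
    mu = sum(w * y for w, y in zip(weights, values)) / total_w
    # Add up the weighted score contributions within each cluster.
    scores = {}
    for y, c, w in zip(values, clusters, weights):
        scores[c] = scores.get(c, 0.0) + w * (y - mu)
    # Sandwich variance: sum of squared cluster scores over (sum of weights)^2.
    var = sum(s * s for s in scores.values()) / total_w ** 2
    return mu, math.sqrt(var)


# Synthetic longitudinal sample (assumption): 200 people, 3 waves each.
# Half the sample comes from an oversampled group with a higher mean outcome,
# so that group carries a smaller sampling weight.
random.seed(0)
values, clusters, weights = [], [], []
for person in range(200):
    oversampled = person % 2 == 0
    weight = 1.0 if oversampled else 3.0
    person_effect = random.gauss(0.0, 1.0)  # induces within-person correlation
    for wave in range(3):
        base = 2.0 if oversampled else 0.0
        values.append(base + person_effect + random.gauss(0.0, 0.5))
        clusters.append(person)
        weights.append(weight)

unweighted_mu, unweighted_se = clustered_mean(values, clusters)
weighted_mu, weighted_se = clustered_mean(values, clusters, weights)
print(f"unweighted: {unweighted_mu:.3f} (SE {unweighted_se:.3f})")
print(f"weighted:   {weighted_mu:.3f} (SE {weighted_se:.3f})")
```

In this constructed example the weights are related to the outcome, so the weighted and unweighted point estimates differ. That is a property of the synthetic design chosen here, not a restatement of the paper's finding that the parameter estimates from the two approaches were similar.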

[1] Jerome P. Reiter, et al. Analytical Modeling in Complex Surveys of Work Practices, 2005.

[2] Phillip S. Kott, et al. A Model-Based Look at Linear Regression with Survey Data, 1991.

[3] D. Pfeffermann. The Role of Sampling Weights when Modeling Survey Data, 1993.

[4] J. Lawless. Event History Analysis and Longitudinal Surveys, 2003.

[5] S. Zeger, et al. Longitudinal data analysis using generalized linear models, 1986.

[6] E. L. Korn, et al. Modelling the sampling design in the analysis of health surveys, 1996, Statistical Methods in Medical Research.

[7] Chris J. Skinner, et al. Random effects models for longitudinal survey data, 2003.

[8] G. G. Koch, et al. Applying sample survey methods to clinical trials data, 2001, Statistics in Medicine.

[9] R. W. Wedderburn. Quasi-likelihood functions, generalized linear models, and the Gauss-Newton method, 1974.

[10] Danny Pfeffermann, et al. Multilevel modelling of complex survey longitudinal data with time varying random effects, 2000.

[11] D. Binder. On the variances of asymptotically normal estimators from complex surveys, 1983.

[12] P. McCullagh. Quasi-Likelihood Functions, 1983.

[13] D. Pfeffermann, et al. The use of sampling weights for survey data analysis, 1996, Statistical Methods in Medical Research.

[14] G. Roberts, et al. How Important Is the Informativeness of the Sample Design, 2005.

[15] P. McCullagh, et al. Generalized Linear Models, 1992.

[16] Edward L. Korn, et al. Examples of Differing Weighted and Unweighted Estimates from a Sample Survey, 1995.

[17] Carl-Erik Särndal, et al. Model Assisted Survey Sampling, 1997.

[18] Graham Kalton, et al. Models in the Practice of Survey Sampling, 1983.

[19] N. Laird, et al. A likelihood-based method for analysing longitudinal binary responses, 1993.

[20] Sunita Ghosh, et al. Comparison of design-based and model-based methods to estimate the variance using National Population Health Survey data, 2008, Model. Assist. Stat. Appl.

[21] K. Y. Liang, et al. Longitudinal data analysis for discrete and continuous outcomes, 1986, Biometrics.

[22] Risto Lehtonen, et al. Practical Methods for Design and Analysis of Complex Surveys, 1995.