Estimating disease prevalence from two-phase surveys with non-response at the second phase.

In this paper we compare several methods for estimating population disease prevalence from data collected by two-phase sampling when there is non-response at the second phase. The traditional weighting-type estimator relies on the assumption that second-phase data are missing completely at random, and it may yield biased estimates if that assumption does not hold. Assuming instead that non-response depends on a set of covariates collected at the first phase, we review two approaches to adjusting for non-response and propose a third: an adjusted weighting-type estimator that uses response probabilities estimated from a response model; a modelling-type estimator that uses disease probabilities predicted from a disease model; and a regression-type estimator that combines the adjusted weighting-type estimator and the modelling-type estimator. These estimators are illustrated using data from an Alzheimer's disease study in two populations.
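The three estimators can be made concrete with a minimal sketch. The abstract does not give their formulas, so everything here is an assumption for illustration: the record layout, the function names, the use of simple within-cell proportions as the response model and the disease model (where a cell is one combination of first-phase covariates), and the form of the regression-type estimator, which is written as the modelling estimate plus a weighted sum of respondent residuals. Each record carries a known second-phase selection probability `pi`, and `y` is `None` for anyone not observed at the second phase.

```python
from collections import defaultdict

def response_prob(records):
    """Response model (assumed): per-cell response rate among selected subjects."""
    sel, resp = defaultdict(int), defaultdict(int)
    for r in records:
        if r["selected"]:
            sel[r["cell"]] += 1
            resp[r["cell"]] += r["y"] is not None
    return {c: resp[c] / sel[c] for c in sel}

def disease_prob(records):
    """Disease model (assumed): per-cell disease proportion among respondents,
    weighted by 1/pi. Requires at least one respondent in every cell."""
    num, den = defaultdict(float), defaultdict(float)
    for r in records:
        if r["y"] is not None:
            num[r["cell"]] += r["y"] / r["pi"]
            den[r["cell"]] += 1 / r["pi"]
    return {c: num[c] / den[c] for c in den}

def weighting_estimate(records):
    """Adjusted weighting type: weight respondents by 1 / (pi * estimated response prob)."""
    rho = response_prob(records)
    total = sum(r["y"] / (r["pi"] * rho[r["cell"]])
                for r in records if r["y"] is not None)
    return total / len(records)

def modelling_estimate(records):
    """Modelling type: average predicted disease probability over the whole phase-1 sample."""
    p = disease_prob(records)
    return sum(p[r["cell"]] for r in records) / len(records)

def regression_estimate(records):
    """Regression type (assumed form): modelling estimate plus weighted respondent residuals."""
    rho, p = response_prob(records), disease_prob(records)
    correction = sum((r["y"] - p[r["cell"]]) / (r["pi"] * rho[r["cell"]])
                     for r in records if r["y"] is not None)
    return modelling_estimate(records) + correction / len(records)

def make_cell(cell, pi, n_unselected, n_nonresp, outcomes):
    """Build hypothetical records for one covariate cell."""
    rows = [{"cell": cell, "pi": pi, "selected": False, "y": None}
            for _ in range(n_unselected)]
    rows += [{"cell": cell, "pi": pi, "selected": True, "y": None}
             for _ in range(n_nonresp)]
    rows += [{"cell": cell, "pi": pi, "selected": True, "y": y} for y in outcomes]
    return rows

# Made-up data: 6 low-score subjects (half sampled), 4 high-score subjects (all sampled).
records = (make_cell("low", 0.5, n_unselected=3, n_nonresp=1, outcomes=[0, 0])
           + make_cell("high", 1.0, n_unselected=0, n_nonresp=2, outcomes=[1, 0]))

for estimator in (weighting_estimate, modelling_estimate, regression_estimate):
    print(estimator.__name__, round(estimator(records), 4))
```

In this toy setup the response and disease models are both saturated in the same cells and `pi` is constant within a cell, so the three estimates agree. They would generally differ once the two models use different covariates or parametric forms such as logistic regression.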
