Weighting methods for population‐based case–control studies with complex sampling

Summary. Complex sample designs, involving stratified and/or multistage sampling with sample weighting, along with frequency matching, are used to select controls or cases for case–control studies. Two examples motivated this paper: the Kaposi sarcoma case–control study conducted in Sicily and the US kidney cancer case–control study. Survey design-based approaches can be inefficient for the analysis of case–control studies with frequency matching. We propose a weighting method that post-stratifies the control sample weights to the estimated population distribution of the matching variables among cases. This weighting maintains the efficiency of frequency matching. The proposed method is evaluated in simulation studies and applied to the two case–control studies.
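
The post-stratification step described in the summary can be illustrated with a minimal sketch. It assumes a single categorical matching variable and rescales each control's weight so that the weighted control total in every matching stratum equals the weighted case total in that stratum. The function name `poststratify_control_weights` and the toy data are hypothetical, and the paper's actual estimator and variance calculations may differ in detail.

```python
from collections import defaultdict


def poststratify_control_weights(case_strata, case_weights,
                                 control_strata, control_weights):
    """Rescale control weights to the weighted case distribution.

    Illustrative assumption: one categorical matching variable, and every
    stratum that contains a control has a positive weighted control total.
    """
    # Estimated population total of cases in each matching stratum.
    case_totals = defaultdict(float)
    for stratum, weight in zip(case_strata, case_weights):
        case_totals[stratum] += weight

    # Weighted total of controls in each matching stratum.
    control_totals = defaultdict(float)
    for stratum, weight in zip(control_strata, control_weights):
        control_totals[stratum] += weight

    # Multiply each control weight by the stratum-level ratio
    # (case total / control total). Strata with no cases get weight 0.
    return [
        weight * case_totals.get(stratum, 0.0) / control_totals[stratum]
        for stratum, weight in zip(control_strata, control_weights)
    ]


# Toy data (hypothetical): matching variable is an age group.
case_strata = ["young", "old", "old"]
case_weights = [1.0, 1.0, 1.0]
control_strata = ["young", "young", "old"]
control_weights = [10.0, 30.0, 20.0]

adjusted = poststratify_control_weights(
    case_strata, case_weights, control_strata, control_weights
)
```

After the adjustment, the weighted distribution of the matching variable among controls mirrors that among cases, while the relative weights of controls within a stratum are unchanged.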
