Weight calibration to improve the efficiency of pure risk estimates from case‐control samples nested in a cohort

Cohort studies provide information on relative hazards and pure risks of disease. For rare outcomes, large cohorts are needed to have sufficient numbers of events, making it costly to obtain covariate information on all cohort members. We focus on nested case‐control designs that are used to estimate relative hazards in the Cox regression model. In 1997, Langholz and Borgan showed that pure risk can also be estimated from nested case‐control data. However, these approaches do not take advantage of covariates that may be available on all cohort members. Researchers have used weight calibration to increase the efficiency of relative hazard estimates from case‐cohort studies and nested case‐control studies. Our objective is to extend weight calibration approaches to nested case‐control designs to improve the precision of estimates of relative hazards and pure risks. We show that additionally calibrating the sample weights against follow‐up times multiplied by relative hazards during the risk projection period improves estimates of pure risk. Efficiency improvements for relative hazards for variables that are available on the entire cohort also contribute to improved efficiency for pure risks. We develop explicit variance formulas for the weight‐calibrated estimates. Simulations show how much precision is improved by calibration and confirm the validity of inference based on asymptotic normality. Examples are provided using data from the American Association of Retired Persons Diet and Health Cohort Study.
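
To make the idea of weight calibration concrete, the following is a minimal sketch of generic linear calibration in the survey-sampling sense: design weights are adjusted so that weighted sample totals of auxiliary variables match totals known for the whole cohort. It is an illustration only, not the estimator developed in the paper. The function names, the toy data, and the choice of auxiliaries (an intercept and a follow-up time) are assumptions made for this example, and the paper's specific auxiliary variables and variance formulas are not reproduced.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left unchanged.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("calibration system is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def calibrate_weights(design_weights, aux, cohort_totals):
    """Linear calibration: w_i = d_i * (1 + aux_i . lam).

    lam is chosen so that sum_i w_i * aux_i equals cohort_totals.
    (Hypothetical helper written for this illustration.)
    """
    p = len(cohort_totals)
    pairs = list(zip(design_weights, aux))
    # Design-weighted totals and cross-products of the auxiliaries.
    sample_totals = [sum(d * x[j] for d, x in pairs) for j in range(p)]
    cross = [[sum(d * x[j] * x[k] for d, x in pairs) for k in range(p)]
             for j in range(p)]
    gap = [t - s for t, s in zip(cohort_totals, sample_totals)]
    lam = solve_linear_system(cross, gap)
    return [d * (1.0 + sum(l * xj for l, xj in zip(lam, x))) for d, x in pairs]


# Toy data (assumed): 4 sampled subjects from a cohort of 8, each with
# design weight 2; auxiliaries are (intercept, follow-up time in years).
design_weights = [2.0, 2.0, 2.0, 2.0]
aux = [[1.0, 3.0], [1.0, 5.0], [1.0, 6.0], [1.0, 8.0]]
cohort_totals = [8.0, 40.0]  # cohort size and total follow-up time

calibrated = calibrate_weights(design_weights, aux, cohort_totals)
weighted_size = sum(calibrated)
weighted_followup = sum(w * x[1] for w, x in zip(calibrated, aux))
print(calibrated, weighted_size, weighted_followup)
```

In this toy example the design-weighted follow-up total (44) overshoots the cohort total (40), so calibration shifts weight away from subjects with long follow-up. Linear calibration can produce negative weights in less favorable settings; other distance functions, such as raking, avoid this at the cost of an iterative solution.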

[1]  N. Breslow,et al.  Survival Analysis of Case-Control Data: A Sample Survey Approach , 2018, Handbook of Statistical Methods for Case-Control Studies.

[2]  N. Breslow,et al.  Inverse Probability Weighting in Nested Case-Control Studies , 2018, Handbook of Statistical Methods for Case-Control Studies.

[3]  Nilanjan Chatterjee,et al.  Handbook of Statistical Methods for Case-Control Studies , 2018 .

[4]  Angela M Wood,et al.  Multiple Imputation of Missing Data in Nested Case-Control and Case-Cohort Studies , 2018, Biometrics.

[5]  Mitchell H. Gail,et al.  Absolute Risk: Methods and Applications in Clinical Management and Public Health , 2018 .

[6]  Jean D. Opsomer,et al.  Model-Assisted Survey Estimation with Modern Prediction Techniques , 2017 .

[7]  C L Rivera,et al.  Using the entire history in the analysis of nested case cohort samples , 2016, Statistics in medicine.

[8]  Donglin Zeng,et al.  Efficient Estimation of Semiparametric Transformation Models for Two-Phase Cohort Studies , 2014, Journal of the American Statistical Association.

[9]  M. Genton,et al.  On the robustness of two-stage estimators , 2012 .

[10]  Nathalie C. Støer,et al.  Comparison of estimators in nested case–control studies with multiple outcomes , 2012, Lifetime Data Analysis.

[11]  Pamela A Shaw,et al.  Connections between Survey Calibration Estimators and Semiparametric Models for Incomplete Data , 2011, International statistical review = Revue internationale de statistique.

[12]  Thomas Lumley,et al.  Using the whole cohort in the analysis of case-cohort data. , 2009, American journal of epidemiology.

[13]  Olli Saarela,et al.  Nested case–control data utilized for multiple outcomes: a likelihood approach and alternatives , 2008, Statistics in medicine.

[14]  Thomas H Scheike,et al.  Maximum likelihood estimation for Cox's regression model under nested case-control sampling. , 2004, Biostatistics.

[15]  A F Subar,et al.  Design and serendipity in establishing a large cohort with wide dietary intake distributions : the National Institutes of Health-American Association of Retired Persons Diet and Health Study. , 2001, American journal of epidemiology.

[16]  Changbao Wu,et al.  A Model-Calibration Approach to Using Complete Auxiliary Information From Survey Data , 2001 .

[17]  Sven Ove Samuelsen,et al.  A pseudolikelihood approach to analysis of nested case-control studies , 1997 .

[18]  B Langholz,et al.  Estimation of absolute risk from nested case-control data. , 1997, Biometrics.

[19]  C. Särndal,et al.  Calibration Estimators in Survey Sampling , 1992 .

[20]  David A. Binder,et al.  Fitting Cox's proportional hazards models from survey data , 1992 .

[21]  Nancy Reid,et al.  Influence functions for proportional hazards regression , 1985 .

[22]  Norman E. Breslow,et al.  Multiplicative Models and Cohort Analysis , 1983 .

[23]  D. Oakes,et al.  Survival Times: Aspects of Partial Likelihood , 1981 .

[24]  R. L. Prentice,et al.  Retrospective studies and failure time models , 1978 .

[25]  F. D. K. Liddell,et al.  Methods of Cohort Analysis : Appraisal by Application to Asbestos Mining , 1977 .

[26]  D. Horvitz,et al.  A Generalization of Sampling Without Replacement from a Finite Universe , 1952 .

[27]  D. R. Cox.  Regression Models and Life-Tables , 1972 .