Semiparametric methods for evaluating the covariate‐specific predictiveness of continuous markers in matched case–control studies

To assess the value of a continuous marker in predicting the risk of a disease, a graphical tool called the predictiveness curve has been proposed. It characterizes the marker's predictiveness, or capacity to stratify the population by risk, by displaying the distribution of risk endowed by the marker. Methods for making inference about the curve and for comparing curves in a general population have been developed. However, knowledge of a marker's performance in the general population alone is not enough. Because a marker's effect in the risk model and its distribution can both differ across subpopulations, its predictiveness may vary from one subpopulation to another. Moreover, information about the predictiveness of a marker conditional on baseline covariates is valuable for individual decisions about whether to have the marker measured. Therefore, to fully realize the usefulness of a risk prediction marker, it is important to study its performance conditional on covariates. In this article, we propose semiparametric methods for estimating covariate-specific predictiveness curves for a continuous marker. Both unmatched and matched case-control study designs are accommodated. We illustrate the methodology by evaluating serum creatinine as a predictor of the risk of renal artery stenosis.
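As a rough illustration of what a predictiveness curve displays, the sketch below computes the empirical quantiles of risk in a simulated random sample from a population. It is not the semiparametric estimator proposed in the article, and it ignores the corrections that case-control sampling requires. The logistic risk model, its coefficients, the simulated marker and the function names are all assumptions made for this example.

```python
import numpy as np

def logistic_risk(y, beta0, beta1):
    """Risk of disease given marker value y under an assumed logistic model."""
    y = np.asarray(y, dtype=float)
    return 1.0 / (1.0 + np.exp(-(beta0 + beta1 * y)))

def predictiveness_curve(risks, v):
    """Empirical v-th quantile(s) of risk: the curve R(v) plotted against v."""
    return np.quantile(np.asarray(risks, dtype=float), v)

# Simulated marker values in a hypothetical population sample.
rng = np.random.default_rng(0)
marker = rng.normal(size=2000)

# Hypothetical coefficients; in practice these would be estimated from data.
risks = logistic_risk(marker, beta0=-2.0, beta1=1.0)

# Evaluate the curve on a grid of population quantiles.
v_grid = np.linspace(0.05, 0.95, 19)
curve = predictiveness_curve(risks, v_grid)

# A covariate-specific curve repeats the same calculation within a
# subpopulation, here defined by a hypothetical binary covariate.
covariate = rng.integers(0, 2, size=marker.size)
curve_in_stratum = predictiveness_curve(risks[covariate == 1], v_grid)
```

A steep curve indicates a marker that separates low-risk from high-risk individuals well, whereas a flat curve near the disease prevalence indicates a marker that adds little information.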
