Borrowing information across populations in estimating positive and negative predictive values

A marker's capacity to predict the risk of a disease depends on the disease prevalence in the target population and on the marker's classification accuracy, i.e., its ability to discriminate diseased subjects from non-diseased subjects. Classification accuracy is often considered an intrinsic property of the marker; it is independent of disease prevalence and hence more likely than risk prediction measures to be similar across populations. In this paper, we are interested in evaluating the population-specific performance of a risk prediction marker in terms of positive predictive value (PPV) and negative predictive value (NPV) at given thresholds, when samples are available from the target population as well as from another population. A default strategy is to estimate PPV and NPV using samples from the target population only. However, when the marker's classification accuracy, as characterized by a specific point on the receiver operating characteristic (ROC) curve, is similar across populations, borrowing information across populations allows increased efficiency in estimating PPV and NPV. We develop estimators that optimally combine information across populations. We apply this methodology to a cross-sectional study in which we evaluate PCA3 as a risk prediction marker for prostate cancer among subjects with or without a previous negative biopsy.
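To make the dependence of PPV and NPV on prevalence concrete, the following is a minimal sketch based on Bayes' rule for a marker dichotomized at a fixed threshold. It is not the estimator developed in the paper: the function names, the naive pooling of true and false positive rates across two populations, and all counts are hypothetical and serve only to illustrate why a shared ROC point combined with a population-specific prevalence yields population-specific predictive values.

```python
def predictive_values(tpr, fpr, prevalence):
    """Return (PPV, NPV) of a dichotomized marker via Bayes' rule.

    tpr: true positive rate at the threshold (sensitivity)
    fpr: false positive rate at the threshold (1 - specificity)
    prevalence: disease prevalence in the target population
    """
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    # P(diseased | marker positive)
    ppv = tpr * prevalence / (tpr * prevalence + fpr * (1.0 - prevalence))
    # P(non-diseased | marker negative)
    npv = (1.0 - fpr) * (1.0 - prevalence) / (
        (1.0 - fpr) * (1.0 - prevalence) + (1.0 - tpr) * prevalence
    )
    return ppv, npv


def pooled_rate(events_target, n_target, events_other, n_other):
    """Naive pooled proportion across two populations.

    Assumption for illustration only: the rate is identical in both
    populations, so the counts are simply added. The paper's optimally
    weighted estimators are not reproduced here.
    """
    return (events_target + events_other) / (n_target + n_other)


# Hypothetical counts: marker-positive subjects among the diseased (TPR)
# and among the non-diseased (FPR), in the target and the other population.
tpr = pooled_rate(40, 50, 85, 100)
fpr = pooled_rate(30, 150, 45, 200)

# The same ROC point gives different predictive values at different prevalences.
ppv_low, npv_low = predictive_values(tpr, fpr, prevalence=0.10)
ppv_high, npv_high = predictive_values(tpr, fpr, prevalence=0.40)
print(f"prevalence 0.10: PPV={ppv_low:.3f}, NPV={npv_low:.3f}")
print(f"prevalence 0.40: PPV={ppv_high:.3f}, NPV={npv_high:.3f}")
```

With the ROC point held fixed, raising the prevalence increases PPV and decreases NPV, which is why the predictive values must be evaluated separately for each population even when classification accuracy can be shared.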
