Model‐free scoring system for risk prediction with application to hepatocellular carcinoma study

There is an increasing need to construct risk-prediction scoring systems for survival data and to identify important risk factors (e.g., biomarkers) for patient screening and treatment recommendation. However, most existing methods either rely on strong model assumptions (e.g., proportional hazards) or handle only binary outcomes. In this article, we propose a flexible method that simultaneously selects important risk factors and identifies their optimal linear combination by maximizing a pseudo-likelihood function based on the time-dependent area under the receiver operating characteristic curve. The method is particularly useful for risk evaluation and for recommending optimal subsequent treatments. We show that the proposed method has desirable theoretical properties, including asymptotic normality and the oracle property after variable selection. Numerical performance is evaluated on several simulated data sets and in an application to hepatocellular carcinoma data.
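
To make the target quantity concrete, the following Python sketch computes an empirical time-dependent AUC for a linear risk score and searches for a good combination of two markers. It is an illustration under simplifying assumptions, not the estimator proposed in this article: subjects censored before the horizon `t` are simply dropped rather than weighted, the pseudo-likelihood and the variable-selection penalty are replaced by a plain grid search over directions, and all names (`auc_at_time`, `best_direction`) and the toy data are hypothetical.

```python
import math


def linear_score(beta, x):
    """Linear combination beta'x of the risk factors for one subject."""
    return sum(b * v for b, v in zip(beta, x))


def auc_at_time(beta, X, times, events, t):
    """Empirical cumulative/dynamic AUC at horizon t for the score beta'x.

    Cases are subjects with an observed event by time t; controls are
    subjects still under observation after t. Subjects censored before t
    are dropped (a simplification made for this sketch).
    """
    cases, controls = [], []
    for x, time, event in zip(X, times, events):
        if time <= t and event == 1:
            cases.append(linear_score(beta, x))
        elif time > t:
            controls.append(linear_score(beta, x))
    if not cases or not controls:
        raise ValueError("need at least one case and one control at time t")
    concordant = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                concordant += 1.0
            elif case_score == control_score:
                concordant += 0.5  # ties count as half
    return concordant / (len(cases) * len(controls))


def best_direction(X, times, events, t, n_angles=180):
    """Grid search over unit vectors in 2-D for the largest empirical AUC."""
    best = None
    for k in range(n_angles):
        theta = 2.0 * math.pi * k / n_angles
        beta = (math.cos(theta), math.sin(theta))
        auc = auc_at_time(beta, X, times, events, t)
        if best is None or auc > best[1]:
            best = (beta, auc)
    return best


# Hypothetical toy data: two markers, follow-up time, event indicator.
X = [(2.0, 0.1), (1.5, -0.3), (0.2, 0.4), (-0.5, 0.0), (1.0, 0.2), (-1.0, -0.1)]
times = [1.0, 2.0, 6.0, 7.0, 2.5, 8.0]
events = [1, 1, 0, 1, 0, 1]

beta_hat, auc_hat = best_direction(X, times, events, t=3.0)
print(beta_hat, auc_hat)
```

Because the AUC depends only on the ranking of the scores, the coefficient vector is identifiable only up to a positive scale factor, which is why the search is restricted to unit vectors. The empirical AUC is also a step function of the coefficients, so a grid search is used here in place of gradient-based optimization.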
