Fridge: Focused fine‐tuning of ridge regression for personalized predictions

Statistical prediction methods typically require some form of tuning-parameter selection, with K-fold cross-validation as the canonical procedure. For ridge regression, numerous selection procedures exist, but common to all of them, including cross-validation, is that a single parameter is chosen for all future predictions. We propose instead to calculate a unique tuning parameter for each individual for whom we wish to predict an outcome. This yields an individualized prediction by focusing on the covariate vector of the specific individual. The focused ridge (fridge) procedure is introduced with a 2-part contribution: first, we define an oracle tuning parameter minimizing the mean squared prediction error at a specific covariate vector, and second, we propose to estimate this tuning parameter by plugging in estimates of the regression coefficients and the error variance. The procedure is extended to logistic ridge regression by means of a parametric bootstrap. For high-dimensional data, we propose to use ridge regression with cross-validation to obtain the plug-in estimates, and experiments on both simulated and real data show that fridge gives smaller average prediction error than standard ridge with cross-validation. We illustrate the new concept for both linear and logistic regression models in 2 applications of personalized medicine: predicting individual risk and treatment response based on gene expression data. The method is implemented in the R package fridge.
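To make the plug-in idea concrete, the sketch below illustrates in R how an individualized tuning parameter could be computed for the linear ridge model: the mean squared prediction error of the ridge predictor at a focus covariate vector x0 is written as squared bias plus variance, the unknown regression coefficients and error variance are replaced by plug-in estimates, and the resulting criterion is minimized over the tuning parameter. This is a minimal sketch based on the standard ridge bias-variance decomposition, not the interface of the fridge package; the function name fridge_lambda, its arguments, and the use of ordinary least squares as plug-in in the toy example are assumptions made here for illustration only.

```r
## Minimal sketch of the focused ridge (fridge) idea for linear ridge regression.
## The names fridge_lambda, beta_plugin and sigma2_plugin are hypothetical and
## do not reflect the interface of the fridge package.

fridge_lambda <- function(X, x0, beta_plugin, sigma2_plugin,
                          lambda_range = c(1e-3, 1e4)) {
  XtX <- crossprod(X)            # X'X
  p   <- ncol(X)

  ## Estimated mean squared prediction error of x0' beta_hat(lambda):
  ## squared bias plus variance, with the unknown beta and sigma^2
  ## replaced by the supplied plug-in estimates.
  mse_hat <- function(lambda) {
    A        <- solve(XtX + lambda * diag(p))     # (X'X + lambda I)^{-1}
    bias     <- -lambda * drop(t(x0) %*% A %*% beta_plugin)
    variance <- sigma2_plugin * drop(t(x0) %*% A %*% XtX %*% A %*% x0)
    bias^2 + variance
  }

  ## Individualized tuning parameter: minimise the estimated MSE for this x0.
  optimize(mse_hat, interval = lambda_range)$minimum
}

## Toy example (n > p) with ordinary least squares as plug-in, purely for
## illustration; in high dimensions, cross-validated ridge would be used instead.
set.seed(1)
n <- 50; p <- 5
X   <- matrix(rnorm(n * p), n, p)
y   <- drop(X %*% rnorm(p) + rnorm(n))
fit <- lm(y ~ X - 1)
x0  <- rnorm(p)                                   # focus covariate vector
lam0     <- fridge_lambda(X, x0, coef(fit), summary(fit)$sigma^2)
beta_lam <- solve(crossprod(X) + lam0 * diag(p), crossprod(X, y))
y0_hat   <- sum(x0 * beta_lam)                    # individualized prediction for x0
```

The search interval for the tuning parameter is arbitrary here and would need to be adapted to the scale of the data; the key point is that the minimization is repeated for every new covariate vector, giving one tuning parameter per individual rather than a single global choice.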
