Lecture notes on ridge regression

The linear regression model cannot be fitted to high-dimensional data, as the high dimensionality brings about empirical non-identifiability. Penalized regression overcomes this non-identifiability by augmenting the loss function with a penalty (i.e. a function of the regression coefficients). The ridge penalty is the sum of the squared regression coefficients, giving rise to ridge regression. Here many aspects of ridge regression are reviewed, e.g. its moments, its mean squared error, its equivalence to constrained estimation, and its relation to Bayesian regression. Its behaviour and use are then illustrated in simulation and on omics data. Subsequently, ridge regression is generalized to allow for a more general penalty. Finally, the framework is translated to logistic regression, and its properties are shown to carry over.
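
As a minimal sketch of the idea, assuming the usual sum-of-squares loss and a single penalty parameter (called `lam` here; the function name and the simulated data are illustrative and not taken from the notes), the ridge estimator solves the penalized normal equations (X'X + lam I) b = X'y. With more covariates than samples X'X is singular, yet the penalized system has a unique solution for any lam > 0:

```python
import numpy as np

def ridge_estimator(X, y, lam):
    """Minimise ||y - X b||^2 + lam * ||b||^2 over b, for lam > 0."""
    p = X.shape[1]
    # Solve (X'X + lam I) b = X'y rather than forming an explicit inverse.
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Simulated high-dimensional data (an assumption for illustration only).
rng = np.random.default_rng(0)
n, p = 10, 50  # more covariates than samples
X = rng.standard_normal((n, p))
y = rng.standard_normal(n)

# The rank of X'X is at most n < p, so the unpenalized normal equations
# do not have a unique solution.
rank_gram = np.linalg.matrix_rank(X.T @ X)

# The ridge estimate is unique for every lam > 0 and shrinks as lam grows.
beta_small = ridge_estimator(X, y, lam=0.1)
beta_large = ridge_estimator(X, y, lam=10.0)
```

Comparing `np.linalg.norm(beta_small)` with `np.linalg.norm(beta_large)` shows the shrinkage effect of a larger penalty parameter.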