Fractional ridge regression: a fast, interpretable reparameterization of ridge regression

Abstract

Background: Ridge regression is a regularization technique that penalizes the L2-norm of the coefficients in linear regression. One of the challenges of using ridge regression is the need to set a hyperparameter (α) that controls the amount of regularization. Cross-validation is typically used to select the best α from a set of candidates, but efficient and appropriate selection of α can be challenging, and becomes prohibitive when large amounts of data are analyzed. Because the selected α depends on the scale of the data and on correlations across predictors, it is also not straightforwardly interpretable.

Results: The present work addresses these challenges through a novel approach to ridge regression. We propose to reparameterize ridge regression in terms of the ratio γ between the L2-norms of the regularized and unregularized coefficients. We provide an algorithm that efficiently implements this approach, called fractional ridge regression, as well as open-source software implementations in Python and MATLAB (https://github.com/nrdg/fracridge). We show that the proposed method is fast and scalable for large-scale data problems. In brain imaging data, we demonstrate that this approach delivers results that are straightforward to interpret and compare across models and datasets.

Conclusion: Fractional ridge regression has several benefits: the solutions obtained for different values of γ are guaranteed to vary, guarding against wasted calculations, and they automatically span the relevant range of regularization, avoiding the need for arduous manual exploration. These properties make fractional ridge regression particularly suitable for the analysis of large, complex datasets.
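To make the reparameterization concrete, the sketch below illustrates the core idea in Python: for a target fraction γ, find the regularization level α whose ridge solution has an L2-norm equal to γ times the norm of the ordinary-least-squares (OLS) solution. This is a minimal illustration, not the authors' reference implementation (see https://github.com/nrdg/fracridge for that); the function name fractional_ridge, the per-γ root-finding strategy, and the bracket bounds are all assumptions made for clarity.

import numpy as np
from scipy.optimize import brentq

def fractional_ridge(X, y, gamma):
    """Illustrative sketch: ridge coefficients whose L2-norm is gamma times
    the OLS norm. Assumes full-column-rank X and 0 < gamma <= 1."""
    # With X = U S V^T, the ridge solution is b(alpha) = V diag(s/(s^2+alpha)) U^T y.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    c = U.T @ y                          # projections of y onto the left singular vectors
    b_ols_rot = c / s                    # OLS solution in the rotated basis (alpha = 0)
    ols_norm = np.linalg.norm(b_ols_rot)
    if gamma >= 1.0:
        return Vt.T @ b_ols_rot          # gamma = 1 recovers the OLS solution

    # ||b(alpha)|| decreases monotonically in alpha, so a 1-D root find
    # locates the alpha whose solution norm is the requested fraction.
    def norm_gap(alpha):
        return np.linalg.norm(s * c / (s**2 + alpha)) - gamma * ols_norm

    alpha = brentq(norm_gap, 0.0, 1e10 * s.max()**2)
    return Vt.T @ (s * c / (s**2 + alpha))

# Example: coefficients whose norm is half that of the OLS solution.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 10))
y = X @ rng.standard_normal(10) + rng.standard_normal(100)
b_half = fractional_ridge(X, y, gamma=0.5)

One root find per γ is used here purely for readability; the fracridge package is designed to solve for many γ values and many targets far more efficiently, which is the scalability claim made in the abstract.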
