Saturating Splines and Feature Selection
Nicholas Boyd, Trevor Hastie, Stephen Boyd, Benjamin Recht

We extend the adaptive regression spline model by incorporating saturation, the natural requirement that a function extend as a constant outside a certain range. We fit saturating splines to data using a convex optimization problem over a space of measures, which we solve using an efficient algorithm based on the conditional gradient method. Unlike many existing approaches, our algorithm solves the original infinite-dimensional (for splines of degree at least two) optimization problem without pre-specified knot locations. We then adapt our algorithm to fit generalized additive models with saturating splines as coordinate functions and show that the saturation requirement allows our model to simultaneously perform feature selection and nonlinear function fitting. Finally, we briefly sketch how the method can be extended to higher order splines and to different requirements on the extension outside the data range.
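The fitting procedure described above relies on the conditional gradient (Frank-Wolfe) method, which handles sparsity-inducing problems by adding one atom at a time. As a minimal, purely illustrative sketch of that idea, the finite-dimensional toy below applies conditional gradient with the classic open-loop step size to a least-squares objective over an ℓ1 ball; this is not the paper's infinite-dimensional measure-space algorithm, and the problem data (`A`, `b`, `tau`) are made up for the example.

```python
import numpy as np

def frank_wolfe_l1(A, b, tau, iters=1000):
    """Conditional gradient sketch: minimize 0.5*||A x - b||^2 over ||x||_1 <= tau."""
    n = A.shape[1]
    x = np.zeros(n)
    for k in range(iters):
        grad = A.T @ (A @ x - b)            # gradient of 0.5*||Ax - b||^2
        i = np.argmax(np.abs(grad))         # linear minimization oracle over the l1 ball
        s = np.zeros(n)
        s[i] = -tau * np.sign(grad[i])      # optimizing vertex of the l1 ball
        gamma = 2.0 / (k + 2.0)             # open-loop step size rule
        x = (1 - gamma) * x + gamma * s     # convex combination preserves feasibility
    return x

# Synthetic example: a sparse signal recovered from noiseless measurements.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 20))
x_true = np.zeros(20)
x_true[[2, 7]] = [1.0, -0.5]
b = A @ x_true
x_hat = frank_wolfe_l1(A, b, tau=1.5)
```

Each iteration touches only one coordinate of the candidate solution, which is the finite-dimensional analogue of adding a single knot (a point mass in the measure) per iteration in the spline setting.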
