Saturating Splines and Feature Selection

We extend the adaptive regression spline model by incorporating saturation: the natural requirement that a function extend as a constant outside a certain range. We fit saturating splines to data via a convex optimization problem over a space of measures, which we solve using an efficient algorithm based on the conditional gradient method. Unlike many existing approaches, our algorithm solves the original optimization problem, which is infinite-dimensional for splines of degree at least two, without pre-specified knot locations. We then adapt our algorithm to fit generalized additive models with saturating splines as coordinate functions, and we show that the saturation requirement allows the model to perform feature selection and nonlinear function fitting simultaneously. Finally, we briefly sketch how the method extends to higher-order splines and to different requirements on the extension outside the data range.
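To make the fitting procedure concrete, here is a minimal sketch of the conditional-gradient (Frank-Wolfe) approach for the simplest case: a degree-1 saturating spline parameterized as f(x) = c + sum_k w_k max(x - t_k, 0), with the weights constrained to sum to zero so that f is constant outside the knot range. Each iteration adds the zero-sum pair of hinge atoms most correlated with the current residual, then refits all weights by least squares (a fully corrective Frank-Wolfe variant). The function names, the choice of candidate knots at the data points, and the fully corrective refit are illustrative assumptions for this sketch; the paper's algorithm operates on the underlying measure-space problem and handles regularization and higher degrees.

```python
import numpy as np

def hinge(x, t):
    # Degree-1 spline atom: max(x - t, 0).
    return np.maximum(x - t, 0.0)

def predict(x, c, knots, w):
    # Evaluate f(x) = c + sum_k w[k] * hinge(x, knots[k]).
    f = np.full_like(x, c, dtype=float)
    for t, wk in zip(knots, w):
        f += wk * hinge(x, t)
    return f

def fit_saturating_spline(x, y, n_iters=30):
    # Greedy conditional-gradient sketch. The zero-sum constraint on the
    # hinge weights makes the fit constant outside the knot range
    # (saturation). Illustrative only; not the paper's exact method.
    knots, w, c = [], np.array([]), float(np.mean(y))
    for _ in range(n_iters):
        r = y - predict(x, c, knots, w)
        # Linear minimization oracle over zero-sum atoms: the hinge most
        # positively and the hinge most negatively correlated with the
        # residual form the next atom pair.
        corr = np.array([hinge(x, t) @ r for t in x])
        for t in (x[np.argmax(corr)], x[np.argmin(corr)]):
            if float(t) not in knots:
                knots.append(float(t))
        # Fully corrective refit: least squares over all current hinges,
        # with the zero-sum constraint folded in by writing the last
        # weight as minus the sum of the others.
        H = np.column_stack([hinge(x, t) for t in knots])
        B = H[:, :-1] - H[:, -1:]
        A = np.column_stack([np.ones_like(x), B])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        c, w_free = coef[0], coef[1:]
        w = np.append(w_free, -w_free.sum())
    return c, np.array(knots), w

# Example: recover a saturating (tanh-like) trend from noisy samples.
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(-3, 3, 200))
y = np.tanh(2 * x) + 0.1 * rng.normal(size=200)
c, knots, w = fit_saturating_spline(x, y)
```

Folding the zero-sum constraint into the design matrix (the B = H[:, :-1] - H[:, -1:] reparameterization) keeps every refit exactly feasible, so saturation holds at every iterate rather than only in the limit.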
