Flexible smoothing with B-splines and penalties

B-splines are attractive for nonparametric modelling, but choosing the optimal number and positions of knots is a complex task. Equidistant knots can be used, but their small and discrete number allows only limited control over smoothness and fit. We propose using a relatively large number of knots together with a difference penalty on the coefficients of adjacent B-splines. We show connections to the familiar spline penalty on the integral of the squared second derivative. We give a short overview of B-splines, their construction, and penalized likelihood. We discuss properties of penalized B-splines and propose various criteria for choosing an optimal penalty parameter. Nonparametric logistic regression, density estimation and scatterplot smoothing serve as examples. Some details of the computations are presented.
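The scatterplot-smoothing case can be illustrated with a minimal sketch. It assumes a least-squares objective of the form |y - Ba|^2 + lam * |D a|^2, where B holds B-splines on equidistant knots and D takes differences of adjacent coefficients. The function names (`bspline_basis`, `diff_matrix`, `solve`, `pspline_fit`, `roughness`) and the defaults (20 segments, cubic splines, second-order differences) are hypothetical choices for illustration, not the authors' implementation. The code is written in plain Python to stay dependency-free.

```python
def bspline_basis(x, xl, xr, nseg, deg=3):
    """Values at x of the nseg + deg B-splines on equally spaced knots."""
    dx = (xr - xl) / nseg
    knots = [xl + (i - deg) * dx for i in range(nseg + 2 * deg + 1)]
    # Degree 0: indicator function of each knot interval.
    b = [1.0 if knots[i] <= x < knots[i + 1] else 0.0
         for i in range(len(knots) - 1)]
    # Cox-de Boor recursion; with equal spacing both denominators are d * dx.
    for d in range(1, deg + 1):
        b = [((x - knots[i]) * b[i] + (knots[i + d + 1] - x) * b[i + 1]) / (d * dx)
             for i in range(len(b) - 1)]
    return b


def diff_matrix(n, order):
    """Matrix that takes differences of the given order of n coefficients."""
    d = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(order):
        d = [[q - p for p, q in zip(d[i], d[i + 1])] for i in range(len(d) - 1)]
    return d


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def pspline_fit(x, y, nseg=20, deg=3, order=2, lam=1.0):
    """Penalized least squares: solve (B'B + lam D'D) a = B'y."""
    xl, xr = min(x), max(x)
    B = [bspline_basis(xi, xl, xr, nseg, deg) for xi in x]
    n = nseg + deg
    D = diff_matrix(n, order)
    A = [[sum(row[j] * row[k] for row in B) + lam * sum(r[j] * r[k] for r in D)
          for k in range(n)] for j in range(n)]
    rhs = [sum(row[j] * yi for row, yi in zip(B, y)) for j in range(n)]
    a = solve(A, rhs)
    fitted = [sum(bj * aj for bj, aj in zip(row, a)) for row in B]
    return a, fitted


def roughness(a, order=2):
    """Sum of squared differences of the coefficients (the penalty term)."""
    for _ in range(order):
        a = [q - p for p, q in zip(a, a[1:])]
    return sum(v * v for v in a)


if __name__ == "__main__":
    import math
    # Synthetic data, invented for this sketch: slow trend plus a fast wiggle.
    x = [i / 49 for i in range(50)]
    y = [math.sin(2 * math.pi * xi) + 0.3 * math.cos(40 * xi) for xi in x]
    for lam in (0.01, 1.0, 100.0):
        a, fitted = pspline_fit(x, y, lam=lam)
        print(lam, roughness(a))
```

In this sketch the number of knots stays fixed and generous, and smoothness is tuned continuously through `lam` alone. With second-order differences, data that lie exactly on a straight line incur no penalty and are reproduced for any `lam`.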
