Twenty years of P-splines

P-splines first appeared in the limelight twenty years ago. Since then they have become popular in applications and in theoretical work. The combination of a rich B-spline basis and a simple difference penalty lends itself well to a variety of generalizations, because it is based on regression. In effect, P-splines allow the building of a “backbone” for the “mixing and matching” of a variety of additive smooth structure components, while inviting all sorts of extensions: varying-coefficient effects, signal (functional) regressors, two-dimensional surfaces, non-normal responses, quantile (expectile) modelling, among others. Strong connections with mixed models and Bayesian analysis have been established. We give an overview of many of the central developments during the first two decades of P-splines.
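The core idea named above, a rich B-spline basis combined with a simple difference penalty and fitted by regression, can be illustrated with a minimal sketch. It assumes equally spaced knots, a Gaussian response, and a fixed smoothing parameter. The function names (`bspline_basis`, `diff_matrix`, `solve`, `pspline_fit`) and the defaults are hypothetical choices for this illustration, not taken from the paper, and the code uses only the Python standard library.

```python
import math


def bspline_basis(x, xl, xr, nseg, deg):
    """Values at x of the nseg + deg equally spaced B-splines on [xl, xr]."""
    dx = (xr - xl) / nseg
    # Knots extend deg segments beyond each end of the domain.
    knots = [xl + (j - deg) * dx for j in range(nseg + 2 * deg + 1)]
    # Degree 0: indicator of the knot interval that contains x.
    b = [1.0 if knots[j] <= x < knots[j + 1] else 0.0
         for j in range(len(knots) - 1)]
    # Cox-de Boor recursion; with equal spacing both denominators are d * dx.
    for d in range(1, deg + 1):
        b = [((x - knots[j]) * b[j] + (knots[j + d + 1] - x) * b[j + 1]) / (d * dx)
             for j in range(len(b) - 1)]
    return b


def diff_matrix(n, order):
    """Difference matrix of the given order acting on n coefficients."""
    d = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(order):
        d = [[d[i + 1][j] - d[i][j] for j in range(n)] for i in range(len(d) - 1)]
    return d


def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (m[r][n] - sum(m[r][k] * z[k] for k in range(r + 1, n))) / m[r][r]
    return z


def pspline_fit(x, y, nseg=10, deg=3, order=2, lam=1.0):
    """Penalized least squares: solve (B'B + lam * D'D) a = B'y."""
    xl, xr = min(x), max(x)
    B = [bspline_basis(xi, xl, xr, nseg, deg) for xi in x]
    n = nseg + deg
    D = diff_matrix(n, order)
    A = [[sum(r[j] * r[k] for r in B) + lam * sum(r[j] * r[k] for r in D)
          for k in range(n)] for j in range(n)]
    rhs = [sum(r[j] * yi for r, yi in zip(B, y)) for j in range(n)]
    coef = solve(A, rhs)
    fitted = [sum(bj * aj for bj, aj in zip(r, coef)) for r in B]
    return coef, fitted


# Illustrative data: a sine curve plus a deterministic high-frequency wiggle.
x = [i / 99 for i in range(100)]
y = [math.sin(2 * math.pi * xi) + 0.1 * math.cos(37 * xi) for xi in x]
coef, fitted = pspline_fit(x, y, nseg=20, lam=1.0)
print(sum((f - yi) ** 2 for f, yi in zip(fitted, y)))
```

Because the penalty acts on differences of adjacent coefficients rather than on derivatives of the curve, the whole fit reduces to one augmented regression. With a second-order penalty, a straight line is left unpenalized, so very large values of `lam` pull the fit toward a linear trend.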
