Component selection and smoothing in smoothing spline analysis of variance models -- COSSO

We propose a new method for model selection and model fitting in nonparametric regression models, in the framework of smoothing spline ANOVA. The “COSSO” is a method of regularization in which the penalty functional is the sum of component norms, instead of the squared norm employed in the traditional smoothing spline method. The COSSO provides a unified framework for several recent proposals for model selection in linear models and smoothing spline ANOVA models. We study theoretical properties of the COSSO estimator, such as its existence and rate of convergence. In the special case of a tensor product design with periodic functions, a detailed analysis reveals that the COSSO applies a novel soft-thresholding-type operation to the function components and selects the correct model structure with probability tending to one. We give an equivalent formulation of the COSSO estimator which leads naturally to an iterative algorithm. We compare the COSSO with MARS, a popular method that builds functional ANOVA models, in simulations and on real examples. The COSSO performs competitively in these studies.
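
The contrast between the two penalties can be sketched as follows. The notation is an assumption made here for illustration and does not appear in the abstract: f is decomposed into p functional components P^α f, the norm is that of the underlying function space, and λ, θ_α, and τ_n are smoothing parameters.

```latex
% Traditional smoothing spline ANOVA: penalty built from squared component norms
\min_{f}\; \frac{1}{n}\sum_{i=1}^{n}\bigl\{y_i - f(x_i)\bigr\}^2
  + \lambda \sum_{\alpha=1}^{p} \theta_\alpha^{-1}\,\bigl\|P^{\alpha} f\bigr\|^{2}

% COSSO: penalty is the sum of (unsquared) component norms
\min_{f}\; \frac{1}{n}\sum_{i=1}^{n}\bigl\{y_i - f(x_i)\bigr\}^2
  + \tau_n^{2} \sum_{\alpha=1}^{p} \bigl\|P^{\alpha} f\bigr\|
```

Under these assumptions, the unsquared norm is not differentiable where a component vanishes, which is what allows entire components to be estimated as exactly zero, much as the L1 penalty does for individual coefficients in the lasso.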
