Automatic Component Selection in Additive Modeling of French National Electricity Load Forecasting

We consider estimation and model selection in sparse high-dimensional additive models in which multiple covariates must be modeled nonparametrically, and we propose multi-step estimators based on B-spline approximations of the additive components. In such models the overall number of regressors d can be large, possibly much larger than the sample size n; we assume, however, that fewer than n regressors capture most of the impact of all covariates on the response variable. Our estimation and model selection results hold without the conventional “separation condition”, that is, without assuming that the norm of each true nonzero component is bounded away from zero. Instead, we relax this assumption by allowing the norms of the nonzero components to converge to zero at a certain rate. The approaches investigated in this paper proceed in two steps: the first performs variable selection, typically by the Group Lasso, and the second applies penalized P-spline estimation to the selected additive components. For the model selection task we discuss several criteria, such as the Akaike information criterion (AIC), the Bayesian information criterion (BIC), and generalized cross-validation (GCV), and we study the consistency of BIC, i.e. its ability to select the true model with probability converging to 1. We then study the post-selection estimation consistency of the selected components. We end the paper by applying the proposed procedure to real data on electricity load forecasting: the EDF (Électricité de France) portfolio.
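The two-step procedure described above can be sketched in code. The following is a minimal illustration, not the paper's implementation: step 1 runs a Group Lasso (solved here by proximal gradient descent) over blocks of B-spline coefficients, one block per covariate, and step 2 refits the selected components by penalized least squares with an Eilers–Marx second-difference penalty. Function names, basis sizes, and tuning values (`lam`, `lam2`) are illustrative assumptions.

```python
import numpy as np
from scipy.interpolate import BSpline
from scipy.linalg import block_diag

def bspline_basis(x, n_basis=6, degree=3):
    """Cubic B-spline design matrix with equally spaced, clamped knots."""
    knots = np.linspace(x.min(), x.max(), n_basis - degree + 1)
    t = np.r_[[knots[0]] * degree, knots, [knots[-1]] * degree]
    return BSpline.design_matrix(x, t, degree).toarray()

def group_lasso_additive(X_groups, y, lam, n_iter=1000):
    """Step 1: Group Lasso over spline-coefficient blocks, via proximal gradient."""
    X = np.hstack(X_groups)
    n = len(y)
    idx = np.cumsum([0] + [G.shape[1] for G in X_groups])
    beta = np.zeros(X.shape[1])
    L = np.linalg.norm(X, 2) ** 2 / n  # Lipschitz constant of the gradient
    for _ in range(n_iter):
        z = beta - X.T @ (X @ beta - y) / (n * L)  # gradient step
        for j in range(len(X_groups)):             # groupwise soft-thresholding
            g = z[idx[j]:idx[j + 1]]
            nrm = np.linalg.norm(g)
            beta[idx[j]:idx[j + 1]] = 0.0 if nrm == 0 else max(0.0, 1 - lam / (L * nrm)) * g
    selected = [j for j in range(len(X_groups))
                if np.linalg.norm(beta[idx[j]:idx[j + 1]]) > 1e-8]
    return beta, selected

def pspline_refit(X_groups, y, selected, lam2=1e-4):
    """Step 2: P-spline refit of the selected components, penalizing
    second differences of each coefficient block (Eilers & Marx)."""
    Xs = np.hstack([X_groups[j] for j in selected])
    pen = []
    for j in selected:
        D = np.diff(np.eye(X_groups[j].shape[1]), n=2, axis=0)
        pen.append(D.T @ D)
    P = block_diag(*pen)
    return np.linalg.solve(Xs.T @ Xs + lam2 * P, Xs.T @ y)
```

In practice the penalty levels `lam` and `lam2` would be chosen by one of the criteria discussed in the paper (AIC, BIC, or GCV), with BIC providing selection consistency; the fixed values here are placeholders for illustration.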
