l1-norm penalised orthogonal forward regression

ABSTRACT An l1-norm penalised orthogonal forward regression (l1-POFR) algorithm is proposed based on the concept of leave-one-out mean square error (LOOMSE). A new l1-norm penalised cost function is defined in the constructed orthogonal space, and each orthogonal basis is associated with an individually tunable regularisation parameter. Owing to orthogonality, the LOOMSE can be computed analytically without actually splitting the data set; moreover, a closed form of the optimal regularisation parameter is derived by greedily minimising the LOOMSE incrementally. We also propose a simple formula for adaptively detecting regressors and moving them to an inactive set, so that the computational cost of the algorithm is significantly reduced. Examples are included to demonstrate the effectiveness of this new l1-POFR approach.
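
The claim that the leave-one-out error can be obtained without splitting the data can be illustrated in a simpler setting. The sketch below is not the l1-POFR algorithm: it assumes plain, unpenalised least squares and uses the standard identity that each leave-one-out residual equals the training residual divided by one minus the leverage, where the leverage is read directly off an orthonormal basis of the regressors. The function names `loo_mse_analytic` and `loo_mse_bruteforce` and the synthetic data are hypothetical, chosen only for illustration.

```python
import numpy as np

def loo_mse_analytic(X, y):
    """Leave-one-out MSE for unpenalised least squares, with no refitting.

    Assumption: this is the classical PRESS identity, not the penalised
    LOOMSE derived in the paper.
    """
    Q, _ = np.linalg.qr(X)            # orthonormal basis for the regressors
    g = Q.T @ y                       # weights in the orthogonal space
    resid = y - Q @ g                 # ordinary training residuals
    leverage = np.sum(Q**2, axis=1)   # diagonal of the hat matrix
    loo_resid = resid / (1.0 - leverage)
    return np.mean(loo_resid**2)

def loo_mse_bruteforce(X, y):
    """Reference value: refit the model n times, leaving one sample out."""
    n = len(y)
    errs = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        theta, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        errs[i] = y[i] - X[i] @ theta
    return np.mean(errs**2)

# Synthetic data (hypothetical): 30 samples, 4 regressors, one of them inactive.
rng = np.random.default_rng(0)
X = rng.standard_normal((30, 4))
y = X @ np.array([1.0, 0.0, -2.0, 0.5]) + 0.1 * rng.standard_normal(30)

analytic = loo_mse_analytic(X, y)
bruteforce = loo_mse_bruteforce(X, y)
print(np.isclose(analytic, bruteforce))
```

The two values agree to numerical precision, while the analytic route needs a single orthogonal decomposition instead of n separate fits. The per-basis regularisation parameters and the inactive-set rule described in the abstract are not reproduced here.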
