Sparse model construction using coordinate descent optimization

We propose a new sparse model construction method aimed at maximizing a model's generalisation capability for a large class of linear-in-the-parameters models. The coordinate descent optimization algorithm is employed with a modified l1-penalized least squares cost function in order to estimate a single parameter and its regularization parameter simultaneously, based on the leave-one-out mean square error (LOOMSE). Our original contribution is to derive a closed-form expression for the LOOMSE-optimal regularization parameter of a single-term model, for which we show that the LOOMSE can be computed analytically without actually splitting the data set, leading to a very simple parameter estimation method. We then integrate the new results within the coordinate descent optimization algorithm to update the model parameters one at a time for linear-in-the-parameters models. Consequently, a fully automated procedure is achieved without resorting to any separate validation data set for iterative model evaluation. Illustrative examples are included to demonstrate the effectiveness of the new approach.
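The two ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract does not state the closed-form regularization parameter, so the sketch uses a fixed, user-supplied penalty `lam` with the standard soft-thresholding coordinate update, and it computes the leave-one-out error only for an unpenalized single-term least squares fit, using the well-known PRESS identity. All function names here are hypothetical and chosen for illustration.

```python
def soft_threshold(z, t):
    """Shrink z towards zero by t (the usual l1 proximal step)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def coordinate_descent_l1(X, y, lam, n_sweeps=100):
    """Cyclic coordinate descent for 0.5*||y - X theta||^2 + lam*||theta||_1.

    Assumption: lam is fixed and supplied by the caller; the paper instead
    chooses the regularization parameter per term from the LOOMSE.
    """
    n, m = len(y), len(X[0])
    theta = [0.0] * m
    residual = list(y)  # y - X theta, kept up to date after every update
    for _ in range(n_sweeps):
        for j in range(m):
            col = [X[i][j] for i in range(n)]
            norm_sq = sum(c * c for c in col)
            if norm_sq == 0.0:
                continue
            # Correlate column j with the residual that excludes term j.
            rho = sum(c * (r + c * theta[j]) for c, r in zip(col, residual))
            new_value = soft_threshold(rho, lam) / norm_sq
            delta = new_value - theta[j]
            if delta != 0.0:
                residual = [r - c * delta for c, r in zip(col, residual)]
            theta[j] = new_value
    return theta


def loomse_single_term(x, y):
    """LOOMSE of the one-parameter least squares model y ~ x*theta,
    computed from the full-data fit via e_i / (1 - h_i), with no refitting."""
    sxx = sum(v * v for v in x)
    theta = sum(a * b for a, b in zip(x, y)) / sxx
    total = 0.0
    for xi, yi in zip(x, y):
        leverage = xi * xi / sxx
        total += ((yi - xi * theta) / (1.0 - leverage)) ** 2
    return total / len(y)


def loomse_brute_force(x, y):
    """Same quantity obtained by actually leaving each sample out in turn."""
    total = 0.0
    for k in range(len(y)):
        xs, ys = x[:k] + x[k + 1:], y[:k] + y[k + 1:]
        theta = sum(a * b for a, b in zip(xs, ys)) / sum(v * v for v in xs)
        total += (y[k] - x[k] * theta) ** 2
    return total / len(y)
```

Comparing `loomse_single_term` with `loomse_brute_force` on the same data shows that the analytic route reproduces the refit-based value for the unpenalized case, which is the kind of saving the abstract refers to when it says the data set need not be split.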
