Linear system identification using stable spline kernels and PLQ penalties

Recently, a new regularized least squares approach to linear system identification has been introduced, in which the penalty term on the impulse response is defined by so-called stable spline kernels. These kernels encode information on regularity and BIBO stability, and depend on a small number of parameters that can be estimated from data. In this paper, we provide new nonsmooth formulations of the stable spline estimator. In particular, we consider linear system identification problems in a very broad setting, where both the regularization functional and the data misfit can be chosen from a rich class of piecewise linear quadratic (PLQ) functions. Moreover, our analysis includes polyhedral inequality constraints on the unknown impulse response. For any formulation in this class, we show that interior point methods can be used to solve the system identification problem with complexity O(n^3) + O(mn^2) per iteration, where n and m are the numbers of impulse response coefficients and measurements, respectively. The usefulness of the framework is illustrated via a numerical experiment in which the output measurements are contaminated by outliers.
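
To make the setting concrete, the sketch below illustrates the two ingredients of the estimator on synthetic data, assuming the standard first-order stable spline kernel K(i, j) = beta^max(i, j) with 0 < beta < 1: a Gaussian (quadratic-loss) stable spline estimate computed in closed form, and a robust Huber-loss variant as one representative member of the PLQ family. For simplicity the robust problem is solved here by iteratively reweighted least squares rather than the interior point method analyzed in the paper, and all sizes and constants (n, m, beta, gamma, kappa) are illustrative choices, not values from the paper.

    import numpy as np

    rng = np.random.default_rng(0)

    # Illustrative problem sizes and constants (not from the paper).
    n, m = 50, 200                       # impulse response length, number of measurements
    beta, gamma, kappa = 0.9, 1.0, 1.0   # kernel decay, regularization weight, Huber threshold

    # First-order stable spline kernel: K[i, j] = beta**max(i, j), 0 < beta < 1.
    idx = np.arange(1, n + 1)
    K = beta ** np.maximum.outer(idx, idx)

    # True impulse response (a decaying oscillation) and the Toeplitz regressor
    # matrix Phi built from a white-noise input, so that y = Phi @ g_true + noise.
    g_true = np.exp(-0.1 * idx) * np.cos(0.5 * idx)
    u = rng.standard_normal(m)
    Phi = np.array([[u[t - k] if t >= k else 0.0 for k in range(n)] for t in range(m)])

    # Output measurements, with large outliers on roughly 10% of the samples.
    y = Phi @ g_true + 0.1 * rng.standard_normal(m)
    out = rng.random(m) < 0.1
    y[out] += 10.0 * rng.standard_normal(out.sum())

    # Gaussian (quadratic-loss) stable spline estimate, in closed form:
    #   g = K Phi' (Phi K Phi' + gamma I)^{-1} y
    G = Phi @ K @ Phi.T
    g_l2 = K @ Phi.T @ np.linalg.solve(G + gamma * np.eye(m), y)

    # Huber-loss variant via iteratively reweighted least squares (a simple
    # stand-in for the interior point method used in the paper). Each sweep
    # solves the weighted problem g = K Phi' (Phi K Phi' + gamma W^{-1})^{-1} y
    # with the Huber weights w_i = min(1, kappa / |r_i|).
    g = g_l2.copy()
    for _ in range(30):
        r = y - Phi @ g
        w = np.minimum(1.0, kappa / np.maximum(np.abs(r), 1e-12))
        g = K @ Phi.T @ np.linalg.solve(G + gamma * np.diag(1.0 / w), y)

    print("L2 estimate error:   ", np.linalg.norm(g_l2 - g_true))
    print("Huber estimate error:", np.linalg.norm(g - g_true))

Note that the naive m-by-m solves above cost O(m^3); the point of the interior point formulation analyzed in the paper is that each iteration can instead be carried out in O(n^3) + O(mn^2) operations.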
