On the maximum entropy property of the first-order stable spline kernel and its implications

A new nonparametric approach for system identification has recently been proposed where the impulse response is seen as the realization of a zero-mean Gaussian process whose covariance, the so-called stable spline kernel, guarantees that the impulse response is almost surely stable. Maximum entropy properties of the stable spline kernel have been pointed out in the literature. In this paper we provide an independent proof that relies on the theory of matrix extension problems in the graphical model literature and leads to a closed-form expression for the inverse of the first-order stable spline kernel, as well as to a new factorization of the form UWUᵀ with U upper triangular and W diagonal. Interestingly, all first-order stable spline kernels share the same factor U, while W admits a closed-form representation in terms of the kernel hyperparameter, making the factorization computationally inexpensive. Maximum likelihood properties of the stable spline kernel are also highlighted. These results can be applied both to improve the numerical stability and to reduce the computational complexity of computing stable spline estimators.
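
To make the two structural claims above concrete, the following minimal NumPy sketch builds the first-order stable spline (TC) kernel, assumed here in its standard unit-scale form K[i, j] = λ^max(i, j) with 0 < λ < 1, checks numerically that its inverse is tridiagonal, and recovers one factorization of the form UWUᵀ with U unit upper triangular and W diagonal. The specific closed-form U and W derived in the paper are not reproduced; the factorization below is obtained generically by a flip-and-Cholesky construction.

```python
import numpy as np

def first_order_ss_kernel(n, lam):
    """First-order stable spline / TC kernel: K[i, j] = lam ** max(i, j),
    indices starting at 1, unit kernel scale assumed, 0 < lam < 1."""
    idx = np.arange(1, n + 1)
    return lam ** np.maximum.outer(idx, idx)

n, lam = 8, 0.7
K = first_order_ss_kernel(n, lam)

# A closed-form (tridiagonal) inverse implies that all entries of K^{-1}
# more than one position off the main diagonal vanish up to round-off.
K_inv = np.linalg.inv(K)
outside_band = K_inv - np.triu(np.tril(K_inv, 1), -1)
print("max |K_inv| outside tridiagonal band:", np.abs(outside_band).max())

# Numerically recover *a* factorization K = U W U^T with U unit upper
# triangular and W diagonal (this only illustrates that such a
# factorization exists; it is not the paper's closed-form result).
P = np.fliplr(np.eye(n))            # order-reversing permutation
C = np.linalg.cholesky(P @ K @ P)   # lower triangular: P K P = C C^T
d = np.diag(C)
U = P @ (C / d) @ P                 # unit upper triangular
W = P @ np.diag(d**2) @ P           # diagonal
print("||U W U^T - K||_max:", np.abs(U @ W @ U.T - K).max())
```

The flip-and-Cholesky step is just one generic way to expose a UWUᵀ structure numerically; the point of the paper's closed-form factors is that they avoid this O(n³) factorization altogether.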
