Learning Nonlinear Generative Models of Time Series With a Kalman Filter in RKHS

This paper presents a novel generative model for time series based on the Kalman filter algorithm in a reproducing kernel Hilbert space (RKHS) using the conditional embedding operator. The result is a nonlinear model that quantifies hidden-state uncertainty and propagates its probability distribution forward, as in the Kalman algorithm. The embedded dynamics are described by the estimated conditional embedding operator, constructed directly from the training measurement data. Using this operator as the counterpart of the state transition matrix, we reformulate the Kalman filter algorithm in the RKHS. In the state model, the hidden states are the estimated embeddings of the measurement distribution; the measurement model connects these estimated measurement embeddings with the current mapped measurements in the RKHS. The algorithm is applied to noisy time-series estimation and prediction, and simulation results show that it outperforms existing algorithms. In addition, improvements are proposed to reduce the size of the operator and the computational complexity.
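To make the recursion concrete, below is a minimal NumPy sketch of a Kalman-style filter run on RKHS embedding coefficients. It is not the paper's exact algorithm: the Gaussian kernel, the identity-style measurement model expressed through projected kernel evaluations, the noise parameters q and r, and the crude nonnegative-weighted pre-image step are all illustrative assumptions, and the names (gaussian_kernel, KernelKalmanSketch) are hypothetical.

```python
import numpy as np


def gaussian_kernel(A, B, sigma=1.0):
    """Gram matrix of a Gaussian (RBF) kernel between row-stacked sample sets."""
    d2 = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-d2 / (2.0 * sigma**2))


class KernelKalmanSketch:
    """Kalman-style recursion on RKHS embedding coefficients (illustrative).

    The hidden state is the coefficient vector w of the embedding
    mu = sum_i w[i] * phi(x_{i+1}) over the training feature maps; the
    transition uses the empirical conditional embedding operator
    C = Phi1 (K00 + lam*N*I)^{-1} Phi0^T, which acts on coefficients as
    the matrix T = (K00 + lam*N*I)^{-1} K01."""

    def __init__(self, X, sigma=1.0, lam=1e-3, q=1e-4, r=1e-2):
        # X: training measurements, shape (N+1, d), forming pairs (x_t, x_{t+1}).
        self.X0, self.X1 = X[:-1], X[1:]
        self.sigma, self.r = sigma, r
        N = len(self.X0)
        K00 = gaussian_kernel(self.X0, self.X0, sigma)
        K01 = gaussian_kernel(self.X0, self.X1, sigma)
        self.T = np.linalg.solve(K00 + lam * N * np.eye(N), K01)  # transition matrix
        self.K1 = gaussian_kernel(self.X1, self.X1, sigma)        # Gram of targets
        self.Q = q * np.eye(N)          # process noise covariance (coefficient space)
        self.w = np.full(N, 1.0 / N)    # initial state: empirical mean embedding
        self.P = np.eye(N)              # initial state covariance

    def step(self, y):
        """One predict/update cycle for a new measurement y of shape (d,)."""
        # Predict: propagate the embedding coefficients and their covariance.
        w_pred = self.T @ self.w
        P_pred = self.T @ self.P @ self.T.T + self.Q
        # Update: the mapped measurement phi(y) enters through its kernel
        # evaluations against the target set, i.e. the projection Phi1^T phi(y).
        k_y = gaussian_kernel(self.X1, y[None, :], self.sigma).ravel()
        # Treat H = K1 (= Phi1^T Phi1) as the measurement matrix on coefficients;
        # this identity-style measurement model is a simplifying assumption.
        S = self.K1 @ P_pred @ self.K1.T + self.r * self.K1   # innovation covariance
        G = P_pred @ self.K1.T @ np.linalg.pinv(S)            # Kalman gain
        self.w = w_pred + G @ (k_y - self.K1 @ w_pred)
        self.P = (np.eye(len(self.w)) - G @ self.K1) @ P_pred
        # Crude pre-image: kernel-weighted mean of the target samples.
        pos = np.clip(self.w, 0.0, None)
        return pos @ self.X1 / max(pos.sum(), 1e-12)


# Toy usage on a noisy sine wave (illustrative only).
t = np.linspace(0.0, 20.0, 400)
clean = np.sin(t)[:, None]
rng = np.random.default_rng(0)
noisy = clean + 0.1 * rng.standard_normal(clean.shape)

kkf = KernelKalmanSketch(noisy[:200], sigma=0.5)
estimates = np.array([kkf.step(y) for y in noisy[200:]])
```

Representing the embedding by a coefficient vector over the training feature maps keeps every update an N-by-N matrix operation, which is exactly where the operator-size and complexity reductions mentioned in the abstract would pay off.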
