The Kernel Adaptive Autoregressive-Moving-Average Algorithm

In this paper, we present a novel kernel adaptive recurrent filtering algorithm based on the autoregressive-moving-average (ARMA) model, trained with recurrent stochastic gradient descent in a reproducing kernel Hilbert space (RKHS). This kernelized recurrent system, the kernel adaptive ARMA (KAARMA) algorithm, brings together the theories of adaptive signal processing and recurrent neural networks (RNNs), extending the current theory of kernel adaptive filtering (KAF), via the representer theorem, to include feedback. Compared with classical feedforward KAF methods, the KAARMA algorithm provides general nonlinear solutions for complex dynamical systems in a state-space representation, with a deferred teacher signal, by propagating the hidden states forward. We demonstrate its capability to provide exact solutions with compact structures by solving a set of benchmark nondeterministic polynomial (NP)-complete problems involving grammatical inference. Simulation results show that the KAARMA algorithm outperforms equivalent input-space recurrent architectures using first- and second-order RNNs, demonstrating its potential as an effective learning solution for the identification and synthesis of deterministic finite automata.
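
To make the state-space formulation concrete, the following is a minimal Python sketch of a KAARMA-flavored kernel recurrent filter. It is not the algorithm described in the paper: the hidden state is updated as a kernel expansion over a growing dictionary of joint (state, input) centers, but the update below uses a truncated, KLMS-style one-step correction applied at the end of each sequence (the deferred teacher signal) rather than the full recurrent gradient propagated through the state recursion. All class names, parameters, and the toy parity task are illustrative assumptions, not the authors' implementation.

```python
# Minimal, illustrative sketch of a KAARMA-style kernel recurrent filter.
# NOT the paper's algorithm: the gradient is truncated (one-step, KLMS-style)
# instead of being propagated through the state recursion, and the dictionary
# grows by one center per training sequence.

import numpy as np


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel on the joint (state, input) vector."""
    d = a - b
    return np.exp(-np.dot(d, d) / (2.0 * bandwidth ** 2))


class SimpleKernelARMA:
    def __init__(self, state_dim, input_dim, step_size=0.1, bandwidth=1.0):
        self.nx = state_dim          # hidden state dimension
        self.nu = input_dim          # input dimension
        self.eta = step_size         # stochastic gradient step size
        self.h = bandwidth           # kernel bandwidth
        self.centers = []            # dictionary of joint (state, input) centers
        self.coeffs = []             # one state-sized coefficient vector per center

    def _transition(self, state, u):
        """State update x_i = sum_j a_j * k([x_{i-1}; u_i], c_j)."""
        z = np.concatenate([state, u])
        new_state = np.zeros(self.nx)
        for c, a in zip(self.centers, self.coeffs):
            new_state += a * gaussian_kernel(z, c, self.h)
        return new_state, z

    def run(self, inputs):
        """Run the recurrence over a sequence; the output is read from the
        last state component (a fixed measurement of the hidden state)."""
        state = np.zeros(self.nx)
        for u in inputs:
            state, _ = self._transition(state, np.atleast_1d(u))
        return state[-1]

    def train_sequence(self, inputs, target):
        """Deferred teacher signal: the error is only available at the end
        of the sequence. Earlier states are treated as constants (truncated
        gradient), and the last joint vector becomes a new dictionary center
        whose coefficient corrects only the output component."""
        state = np.zeros(self.nx)
        z = np.concatenate([state, np.zeros(self.nu)])
        for u in inputs:
            state, z = self._transition(state, np.atleast_1d(u))
        error = target - state[-1]
        coeff = np.zeros(self.nx)
        coeff[-1] = self.eta * error
        self.centers.append(z)
        self.coeffs.append(coeff)
        return error


# Tiny usage example on a toy binary-string (parity) task, illustrative only.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    model = SimpleKernelARMA(state_dim=3, input_dim=1, step_size=0.5, bandwidth=0.7)
    for _ in range(200):
        seq = rng.integers(0, 2, size=5).astype(float)
        label = float(seq.sum() % 2)
        model.train_sequence(seq, label)
    test = rng.integers(0, 2, size=5).astype(float)
    print("prediction:", model.run(test), "target:", test.sum() % 2)
```

The design choice to read the output from a fixed slice of the hidden state mirrors the state-space view in the abstract, where the measurement is a simple projection of the state; the full KAARMA algorithm instead adapts the entire state transition in the RKHS with the error back-propagated through the recursion.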
