Multikernel Adaptive Filtering

This paper demonstrates that using multiple kernels leads to efficient adaptive filtering for nonlinear systems. Two multikernel adaptive filtering algorithms are proposed. The first is a simple generalization of the kernel normalized least mean square (KNLMS) algorithm [2] that adopts a coherence criterion for dictionary design. The second is derived by applying the adaptive proximal forward-backward splitting method to a certain squared distance function plus a weighted block ℓ1-norm penalty, which encourages sparsity of the adaptive filter at the block level for efficiency. The proposed multikernel approach enjoys a higher degree of freedom than approaches that design a single kernel as a convex combination of multiple kernels. Numerical examples show that the proposed approach achieves significant gains, particularly for nonstationary data, and is insensitive to the choice of some design parameters.
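The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the paper's exact algorithm, and several details are assumptions made for illustration: Gaussian kernels with hypothetical widths `sigmas`, a coherence test that takes the maximum kernel value over all dictionary elements and all kernels, one coefficient block per dictionary element (one coefficient per kernel), and a plain normalized LMS step standing in for the "certain squared distance function". The names `MultikernelNLMS`, `block_soft_threshold`, `step`, `reg`, `coherence`, and `lam` are hypothetical.

```python
import numpy as np

def gaussian_kernel(x, c, sigma):
    """Gaussian kernel, normalized so that k(x, x) = 1."""
    diff = np.asarray(x, dtype=float) - np.asarray(c, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / (2.0 * sigma ** 2)))

def block_soft_threshold(H, thresholds):
    """Proximity operator of sum_j thresholds[j] * ||H[j]||_2 (rows are blocks)."""
    norms = np.linalg.norm(H, axis=1)
    # Shrink each row toward zero; rows with norm below the threshold vanish.
    scale = np.maximum(0.0, 1.0 - thresholds / np.maximum(norms, 1e-12))
    return H * scale[:, None]

class MultikernelNLMS:
    """Illustrative multikernel NLMS filter with a coherence-based dictionary."""

    def __init__(self, sigmas, step=0.5, reg=1e-3, coherence=0.9, lam=0.0):
        self.sigmas = list(sigmas)          # one Gaussian width per kernel
        self.step = step                    # step size
        self.reg = reg                      # regularizer in the normalization
        self.coherence = coherence          # coherence threshold (assumed form)
        self.lam = lam                      # block-sparsity weight, 0 disables it
        self.centers = []                   # dictionary of stored inputs
        self.H = np.zeros((0, len(self.sigmas)))  # one row per dictionary element

    def _features(self, x):
        vals = [[gaussian_kernel(x, c, s) for s in self.sigmas]
                for c in self.centers]
        return np.array(vals, dtype=float).reshape(len(self.centers),
                                                   len(self.sigmas))

    def predict(self, x):
        return float(np.sum(self.H * self._features(x)))

    def update(self, x, d):
        K = self._features(x)
        # Admit x into the dictionary only if it is not too coherent with
        # the elements already stored, under every kernel.
        if not self.centers or np.max(K) <= self.coherence:
            self.centers.append(np.asarray(x, dtype=float))
            self.H = np.vstack([self.H, np.zeros((1, len(self.sigmas)))])
            K = self._features(x)
        err = d - float(np.sum(self.H * K))
        # Forward step: normalized LMS update of all kernel coefficients.
        self.H = self.H + self.step * err * K / (self.reg + np.sum(K ** 2))
        # Backward step: block soft-thresholding, uniform weights assumed.
        if self.lam > 0.0:
            thr = np.full(len(self.centers), self.step * self.lam)
            self.H = block_soft_threshold(self.H, thr)
        return err
```

With `lam=0.0` the class behaves like a coherence-based multikernel NLMS filter; a positive `lam` adds the thresholding step, which drives whole rows of `H` to zero so that the corresponding dictionary elements could be discarded.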

[1]  M. Yuan, et al. Model selection and estimation in regression with grouped variables, 2006.

[2]  Francis R. Bach, et al. Consistency of the group Lasso and multiple kernel learning, 2007, J. Mach. Learn. Res.

[3]  Isao Yamada, et al. Adaptive Parallel Quadratic-Metric Projection Algorithms, 2007, IEEE Transactions on Audio, Speech, and Language Processing.

[4]  Thomas M. Cover, et al. Geometrical and Statistical Properties of Systems of Linear Inequalities with Applications in Pattern Recognition, 1965, IEEE Trans. Electron. Comput.

[5]  B. Ripley, et al. Pattern Recognition, 1968, Nature.

[6]  N. Cristianini, et al. On Kernel-Target Alignment, 2001, NIPS.

[7]  Michael I. Jordan, et al. Multiple kernel learning, conic duality, and the SMO algorithm, 2004, ICML.

[8]  Babak Hassibi, et al. On the Reconstruction of Block-Sparse Signals With an Optimal Number of Measurements, 2008, IEEE Transactions on Signal Processing.

[9]  Alexander J. Smola, et al. Learning with kernels, 1998.

[10]  Bernhard Schölkopf, et al. A Generalized Representer Theorem, 2001, COLT/EuroCOLT.

[11]  Andreas S. Weigend, et al. Time Series Prediction: Forecasting the Future and Understanding the Past, 1994.

[12]  Paul Honeine, et al. Online Prediction of Time Series Data With Kernels, 2009, IEEE Transactions on Signal Processing.

[13]  Kazuhiko Ozeki, et al. An adaptive filtering algorithm using an orthogonal projection to an affine subspace and its properties, 1984.

[14]  Alexander J. Smola, et al. Online learning with kernels, 2001, IEEE Transactions on Signal Processing.

[15]  Masahiro Yukawa, et al. An efficient kernel adaptive filtering algorithm using hyperplane projection along affine subspace, 2012, 2012 Proceedings of the 20th European Signal Processing Conference (EUSIPCO).

[16]  Masahiro Yukawa. Nonlinear adaptive filtering techniques with multiple kernels, 2011, 2011 19th European Signal Processing Conference.

[17]  José Antonio Apolinário, et al. Constrained adaptation algorithms employing Householder transformation, 2002, IEEE Trans. Signal Process.

[18]  Jinbo Bi, et al. Column-generation boosting methods for mixture of kernels, 2004, KDD.

[19]  Gunnar Rätsch, et al. Large Scale Multiple Kernel Learning, 2006, J. Mach. Learn. Res.

[20]  Sergios Theodoridis, et al. Online Kernel-Based Classification Using Adaptive Projection Algorithms, 2008, IEEE Transactions on Signal Processing.

[21]  D.T.M. Stock. The block underdetermined covariance (BUC) fast transversal filter (FTF) algorithm for adaptive filtering, 1992, Conference Record of the Twenty-Sixth Asilomar Conference on Signals, Systems & Computers.

[22]  I. Yamada, et al. Pairwise Optimal Weight Realization—Acceleration Technique for Set-Theoretic Adaptive Parallel Subgradient Projection Algorithm, 2006, IEEE Transactions on Signal Processing.

[23]  G. Wahba, et al. Some results on Tchebycheffian spline functions, 1971.

[24]  David L. Donoho, et al. De-noising by soft-thresholding, 1995, IEEE Trans. Inf. Theory.

[25]  Shirish Nagaraj, et al. Set-membership filtering and a set-membership normalized LMS algorithm with an adaptive step size, 1998, IEEE Signal Processing Letters.

[26]  Eric R. Ziegel, et al. The Elements of Statistical Learning, 2003, Technometrics.

[27]  Gert Cauwenberghs, et al. Incremental and Decremental Support Vector Machine Learning, 2000, NIPS.

[28]  Klaus-Robert Müller, et al. Incremental Support Vector Learning: Analysis, Implementation and Applications, 2006, J. Mach. Learn. Res.

[29]  Weifeng Liu, et al. Kernel Adaptive Filtering, 2010.

[30]  Ogawa Hidemitsu. Sampling Theory and Principle of Science, 2011.

[31]  Weifeng Liu, et al. Kernel Affine Projection Algorithms, 2008, EURASIP J. Adv. Signal Process.

[32]  L. Glass, et al. Oscillation and chaos in physiological control systems, 1977, Science.

[33]  D. Luenberger. Optimization by Vector Space Methods, 1968.

[34]  Nello Cristianini, et al. Learning the Kernel Matrix with Semidefinite Programming, 2002, J. Mach. Learn. Res.

[35]  Isao Yamada, et al. A sparse adaptive filtering using time-varying soft-thresholding techniques, 2010, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing.

[36]  Stefan Werner, et al. Set-membership binormalized data-reusing LMS algorithms, 2003, IEEE Trans. Signal Process.

[37]  Olivier Bousquet, et al. On the Complexity of Learning the Kernel Matrix, 2002, NIPS.

[38]  Patrick L. Combettes, et al. Signal Recovery by Proximal Forward-Backward Splitting, 2005, Multiscale Model. Simul.

[39]  Masahiro Yukawa, et al. Nonlinear channel equalization by multi-kernel adaptive filter, 2012, 2012 IEEE 13th International Workshop on Signal Processing Advances in Wireless Communications (SPAWC).

[40]  Masahiro Yukawa, et al. Krylov-Proportionate Adaptive Filtering Techniques Not Limited to Sparse Systems, 2009, IEEE Transactions on Signal Processing.

[41]  Alexander J. Smola, et al. Learning the Kernel with Hyperkernels, 2005, J. Mach. Learn. Res.

[42]  T. Hinamoto, et al. Extended theory of learning identification, 1975.

[43]  Sergios Theodoridis, et al. Adaptive Constrained Learning in Reproducing Kernel Hilbert Spaces: The Robust Beamforming Case, 2009, IEEE Transactions on Signal Processing.

[44]  Isao Yamada, et al. An efficient robust adaptive filtering algorithm based on parallel subgradient projection techniques, 2002, IEEE Trans. Signal Process.

[45]  Kristin P. Bennett, et al. MARK: a boosting algorithm for heterogeneous kernel models, 2002, KDD.

[46]  Ingo Steinwart, et al. On the Influence of the Kernel on the Consistency of Support Vector Machines, 2002, J. Mach. Learn. Res.

[47]  Yih-Fang Huang, et al. Kernelized set-membership approach to nonlinear adaptive filtering, 2005, Proceedings (ICASSP '05), IEEE International Conference on Acoustics, Speech, and Signal Processing.

[48]  Charles A. Micchelli, et al. Learning the Kernel Function via Regularization, 2005, J. Mach. Learn. Res.

[49]  Stephen P. Boyd, et al. Optimal kernel selection in Kernel Fisher discriminant analysis, 2006, ICML.

[50]  Shie Mannor, et al. The kernel recursive least-squares algorithm, 2004, IEEE Transactions on Signal Processing.

[51]  N. Aronszajn. Theory of Reproducing Kernels, 1950.

[52]  Daphna Weinshall, et al. Learning a kernel function for classification with small training samples, 2006, ICML.