Online Distributed Learning Over Networks in RKH Spaces Using Random Fourier Features

We present a novel diffusion scheme for online kernel-based learning over networks. A major drawback of online learning algorithms that operate in a reproducing kernel Hilbert space (RKHS) has so far been the need to update a number of parameters that grows with each time iteration. Besides increasing complexity, this raises the demand for communication resources in a distributed setting. In contrast, we propose to approximate the solution as a fixed-size vector (of larger dimension than the input space) using the previously introduced framework of random Fourier features. This paves the way for using standard linear combine-then-adapt techniques. To the best of our knowledge, this is the first time that a complete protocol for distributed online learning in RKHS has been presented. Conditions for asymptotic convergence and boundedness of the network-wise regret are also provided. Simulated tests illustrate the performance of the proposed scheme.
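The idea of replacing a growing kernel expansion with a fixed-size vector can be illustrated with a minimal sketch. It is not the paper's exact protocol: it assumes a Gaussian kernel, a textbook combine-then-adapt LMS update with the error evaluated at the combined estimate, and a hypothetical three-node network with a hand-picked combination matrix. All names (`make_rff_map`, `combine_then_adapt`, `weights`, `mu`) are illustrative assumptions.

```python
import math
import random


def make_rff_map(input_dim, num_features, sigma, rng):
    """Return z(x), a random Fourier feature map for a Gaussian kernel.

    Assumes kappa(x, y) = exp(-||x - y||^2 / (2 * sigma^2)), so the
    frequencies are drawn from N(0, sigma^-2 I) and the phases from U[0, 2*pi].
    """
    omegas = [[rng.gauss(0.0, 1.0 / sigma) for _ in range(input_dim)]
              for _ in range(num_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def z(x):
        return [scale * math.cos(sum(w * xi for w, xi in zip(omega, x)) + b)
                for omega, b in zip(omegas, phases)]

    return z


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def combine_then_adapt(thetas, weights, samples, z, mu):
    """One network-wide iteration of a linear combine-then-adapt LMS.

    thetas[k]     : current fixed-size estimate at node k
    weights[l][k] : weight node k gives to node l's estimate (columns sum to 1)
    samples[k]    : the (x, y) pair observed by node k at this iteration
    """
    num_nodes, dim = len(thetas), len(thetas[0])
    # Combine step: each node averages the estimates it receives.
    psis = [[sum(weights[l][k] * thetas[l][j] for l in range(num_nodes))
             for j in range(dim)] for k in range(num_nodes)]
    # Adapt step: ordinary LMS update in the random feature space.
    updated = []
    for k in range(num_nodes):
        x, y = samples[k]
        zx = z(x)
        err = y - dot(psis[k], zx)
        updated.append([p + mu * err * zi for p, zi in zip(psis[k], zx)])
    return updated


# Hypothetical demo: three fully connected nodes learn y = sin(x) online.
rng = random.Random(0)
num_features = 100
z = make_rff_map(input_dim=1, num_features=num_features, sigma=1.0, rng=rng)
weights = [[0.5, 0.25, 0.25], [0.25, 0.5, 0.25], [0.25, 0.25, 0.5]]
thetas = [[0.0] * num_features for _ in range(3)]
for _ in range(2000):
    samples = []
    for _ in range(3):
        x = [rng.uniform(-3.0, 3.0)]
        samples.append((x, math.sin(x[0])))
    thetas = combine_then_adapt(thetas, weights, samples, z, mu=0.5)
```

In this sketch each node stores and transmits exactly `num_features` numbers per iteration, regardless of how many samples have been processed, which is the property the abstract contrasts with a growing kernel expansion.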
