Asynchronous Online Federated Learning for Edge Devices

Federated learning (FL) is a machine learning paradigm in which a shared central model is learned across multiple distributed client devices while the training data remains on the edge devices or local clients. Most prior work on federated learning uses Federated Averaging (FedAvg) as the optimization method and trains in a synchronized fashion: clients train independently, and the server performs synchronous aggregation steps. However, the assumptions made by FedAvg are unrealistic given the heterogeneity of edge devices. In particular, the volume and distribution of the collected data vary during training because edge devices sample at different rates. The devices also differ in available communication bandwidth and in system configuration, such as memory, processor speed, and power requirements, which leads to vastly different training times as well as model/data transfer times. Furthermore, availability issues can prevent specific edge devices from contributing to the federated model at all. In this paper, we present an Asynchronous Online Federated Learning (ASO-Fed) framework in which the edge devices perform online learning on continuously streaming local data while a central server aggregates model parameters from the local clients. Our framework updates the central model asynchronously to handle both the varying computational loads of heterogeneous edge devices and devices that lag behind or drop out. Experiments on three real-world datasets show that ASO-Fed lowers the overall training cost while maintaining good prediction performance.
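The contrast drawn above, synchronous FedAvg aggregation versus asynchronous merging of whichever client update arrives next, can be illustrated with a minimal sketch. This is not the paper's actual ASO-Fed update rule; it is a generic staleness-weighted asynchronous averaging scheme, and the function names, the mixing rate `alpha`, and the `1/(1 + staleness)` decay are all illustrative assumptions.

```python
def aggregate_async(global_model, client_model, client_round, server_round, alpha=0.5):
    """Merge one client's update into the global model as soon as it arrives.

    Unlike synchronous FedAvg, the server does not wait for all clients:
    each incoming update is mixed in immediately, down-weighted by its
    staleness (server_round - client_round) so that updates computed
    against an older global model contribute less.
    """
    staleness = server_round - client_round
    weight = alpha / (1 + staleness)  # slower/older clients get smaller weight
    return [(1 - weight) * g + weight * c
            for g, c in zip(global_model, client_model)]

# Toy run: three client updates arrive one at a time; the third client
# trained against the round-2 model, the second lags by one round.
global_model = [0.0, 0.0]
updates = [([1.0, 1.0], 0), ([2.0, 0.0], 0), ([0.0, 2.0], 2)]  # (params, round trained on)
for server_round, (client_model, client_round) in enumerate(updates):
    global_model = aggregate_async(global_model, client_model,
                                   client_round, server_round)
```

The key design point this sketch captures is that the server's state advances after every single client update, so stragglers never block progress; the staleness weight is one simple way to keep late-arriving updates from dragging the model backward.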
