Towards Federated Learning on Time-Evolving Heterogeneous Data

Federated Learning (FL) is an emerging learning paradigm that preserves privacy by keeping client data local on edge devices. Optimizing FL is challenging in practice due to the diversity and heterogeneity of the learning system. Despite recent research efforts on improving optimization under heterogeneous data, the impact of time-evolving heterogeneous data in real-world scenarios, such as changing client data or intermittent clients joining or leaving during training, has not been well studied. In this work, we propose Continual Federated Learning (CFL), a flexible framework that captures the time-evolving heterogeneity of FL. CFL covers complex and realistic scenarios, which are challenging to evaluate under previous FL formulations, by extracting information from past local datasets and approximating the local objective functions. Theoretically, we demonstrate that CFL methods achieve a faster convergence rate than FedAvg in time-evolving scenarios, with the benefit depending on the approximation quality. In a series of experiments, we show that our numerical findings match the convergence analysis and that CFL methods significantly outperform other state-of-the-art (SOTA) FL baselines.
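To make the "approximating the local objective functions" idea concrete, the following is a minimal sketch of what one client's local update could look like if the past local objective is summarized by a diagonal quadratic surrogate (an EWC-style penalty around the parameters fit to the client's earlier data). This is an illustrative assumption, not the paper's actual algorithm; all function and parameter names (local_update, grad_fn, fisher_diag, lam) are hypothetical.

```python
import numpy as np

def local_update(w, grad_fn, w_past, fisher_diag, lam=0.1, lr=0.01, steps=10):
    """One client's local training round in a CFL-style setup (sketch).

    The past local objective is approximated by a diagonal quadratic
    penalty around w_past (the parameters that fit the client's earlier
    data), weighted by a Fisher-information estimate fisher_diag.
    """
    for _ in range(steps):
        g_current = grad_fn(w)                      # gradient on the client's current data
        g_past = lam * fisher_diag * (w - w_past)   # gradient of the quadratic surrogate
        w = w - lr * (g_current + g_past)
    return w

# Toy usage: quadratic local loss ||w - 1||^2 on current data,
# with earlier data summarized around w_past = 0.
w0 = np.zeros(5)
w_new = local_update(
    w0,
    grad_fn=lambda w: 2.0 * (w - 1.0),
    w_past=np.zeros(5),
    fisher_diag=np.ones(5),
)
print(w_new)
```

The server would then aggregate the returned client models as in standard FedAvg; how well the quadratic surrogate tracks the true past local losses is exactly the "approximation quality" that the convergence benefit depends on.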
