Can Fair Federated Learning Reduce the need for Personalisation?

Federated Learning (FL) enables training ML models on edge clients without sharing data. However, the federated model's performance on local data varies, disincentivising the participation of clients who benefit little from FL. Fair FL reduces accuracy disparity by focusing on clients with higher losses while personalisation locally fine-tunes the model. Personalisation provides a participation incentive when an FL model underperforms relative to one trained locally. For situations where the federated model provides a lower accuracy than a model trained entirely locally by a client, personalisation improves the accuracy of the pre-trained federated weights to be similar to or exceed those of the local client model. This paper evaluates two Fair FL (FFL) algorithms as starting points for personalisation. Our results show that FFL provides no benefit to relative performance in a language task and may double the number of underperforming clients for an image task. Instead, we propose Personalisation-aware Federated Learning (PaFL) as a paradigm that pre-emptively uses personalisation losses during training. Our technique shows a 50% reduction in the number of underperforming clients for the language task while lowering the number of underperforming clients in the image task instead of doubling it. Thus, evidence indicates that it may allow a broader set of devices to benefit from FL and represents a promising avenue for future experimentation and theoretical analysis.

[1]  Michael W. Mahoney,et al.  A Survey of Quantization Methods for Efficient Neural Network Inference , 2021, Low-Power Computer Vision.

[2]  Rogier C. van Dalen,et al.  Federated Evaluation and Tuning for On-Device Personalization: System Design & Applications , 2021, ArXiv.

[3]  Virginia Smith,et al.  Ditto: Fair and Robust Federated Learning Through Personalization , 2020, ICML.

[4]  Virginia Smith,et al.  Tilted Empirical Risk Minimization , 2020, ICLR.

[5]  Y. Mansour,et al.  Three Approaches for Personalization with Applications to Federated Learning , 2020, ArXiv.

[6]  Vitaly Shmatikov,et al.  Salvaging Federated Learning by Local Adaptation , 2020, ArXiv.

[7]  Richard Nock,et al.  Advances and Open Problems in Federated Learning , 2019, Found. Trends Mach. Learn..

[8]  H. Vincent Poor,et al.  Federated Learning With Differential Privacy: Algorithms and Performance Analysis , 2019, IEEE Transactions on Information Forensics and Security.

[9]  Hubert Eichner,et al.  Federated Evaluation of On-device Personalization , 2019, ArXiv.

[10]  Phillip B. Gibbons,et al.  The Non-IID Data Quagmire of Decentralized Machine Learning , 2019, ICML.

[11]  Tzu-Ming Harry Hsu,et al.  Measuring the Effects of Non-Identical Data Distribution for Federated Visual Classification , 2019, ArXiv.

[12]  Anit Kumar Sahu,et al.  Federated Learning: Challenges, Methods, and Future Directions , 2019, IEEE Signal Processing Magazine.

[13]  Xiang Li,et al.  On the Convergence of FedAvg on Non-IID Data , 2019, ICLR.

[14]  Tian Li,et al.  Fair Resource Allocation in Federated Learning , 2019, ICLR.

[15]  Anit Kumar Sahu,et al.  Federated Optimization in Heterogeneous Networks , 2018, MLSys.

[16]  Sebastian Caldas,et al.  LEAF: A Benchmark for Federated Settings , 2018, ArXiv.

[17]  Yue Zhao,et al.  Federated Learning with Non-IID Data , 2018, ArXiv.

[18]  Kannan Ramchandran,et al.  Byzantine-Robust Distributed Learning: Towards Optimal Statistical Rates , 2018, ICML.

[19]  Andrei A. Rusu,et al.  Overcoming catastrophic forgetting in neural networks , 2016, Proceedings of the National Academy of Sciences.

[20]  Blaise Agüera y Arcas,et al.  Communication-Efficient Learning of Deep Networks from Decentralized Data , 2016, AISTATS.

[21]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[22]  Geoffrey E. Hinton,et al.  Distilling the Knowledge in a Neural Network , 2015, ArXiv.

[23]  Yoshua Bengio,et al.  An Empirical Investigation of Catastrophic Forgeting in Gradient-Based Neural Networks , 2013, ICLR.

[24]  R. A. Leibler,et al.  On Information and Sufficiency , 1951 .