Federated Learning with Additional Mechanisms on Clients to Reduce Communication Costs

Federated learning (FL) enables on-device training over distributed networks consisting of a massive number of modern smart devices, such as smartphones and IoT (Internet of Things) devices. However, the leading optimization algorithm in such settings, federated averaging (FedAvg), suffers from heavy communication costs and an inevitable performance drop, especially when the local data are distributed in a non-IID fashion. To alleviate these problems, we propose two potential solutions that introduce additional mechanisms into on-device training. The first, FedMMD, replaces the single on-device model of vanilla FedAvg with a two-stream model trained under an MMD (Maximum Mean Discrepancy) constraint. Experiments show that the proposed method outperforms baselines, especially in non-IID FL settings, while reducing the required communication rounds by more than 20%. The second, FedFusion, is FL with feature fusion: by aggregating the features from both the local and global models, we achieve higher accuracy at lower communication cost. Furthermore, the feature fusion modules offer better initialization for newly incoming clients and thus speed up convergence. Experiments in popular FL scenarios show that FedFusion outperforms baselines in both accuracy and generalization ability while reducing the number of required communication rounds by more than 60%.
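To make the first mechanism concrete, below is a minimal PyTorch sketch of a FedMMD-style client update. It is not the authors' implementation: the Gaussian-kernel bandwidth `sigma`, the penalty weight `lam`, the function names, and the assumed `features`/`head` split of the model are all illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def mmd2(x: torch.Tensor, y: torch.Tensor, sigma: float = 1.0) -> torch.Tensor:
    """Biased empirical estimate of squared MMD with a Gaussian kernel.
    x, y: feature batches of shape (batch, dim)."""
    def gram(a, b):
        d2 = torch.cdist(a, b).pow(2)             # pairwise squared distances
        return torch.exp(-d2 / (2 * sigma ** 2))  # Gaussian (RBF) kernel
    return gram(x, x).mean() + gram(y, y).mean() - 2 * gram(x, y).mean()

def fedmmd_client_step(local_model, global_model, inputs, labels,
                       optimizer, lam: float = 0.1) -> float:
    """One on-device step: task loss on the trainable local stream plus an
    MMD penalty pulling its features toward the frozen global stream's."""
    feats_local = local_model.features(inputs)        # trainable local stream
    with torch.no_grad():
        feats_global = global_model.features(inputs)  # frozen global stream
    loss = F.cross_entropy(local_model.head(feats_local), labels)
    loss = loss + lam * mmd2(feats_local, feats_global)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```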
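The second mechanism can be sketched in the same spirit: a small fusion module blends the feature maps produced by the local and global feature extractors before the classifier head. The gated form below is one plausible variant under the same assumptions; the class name and the scalar gate are illustrative, and a convolutional variant could mix the concatenated channels instead.

```python
import torch
import torch.nn as nn

class FeatureFusion(nn.Module):
    """Learnable gate blending local and global feature maps on-device."""
    def __init__(self):
        super().__init__()
        self.gate = nn.Parameter(torch.tensor(0.0))  # learned per client

    def forward(self, feat_local: torch.Tensor,
                feat_global: torch.Tensor) -> torch.Tensor:
        g = torch.sigmoid(self.gate)  # squash so the weight stays in (0, 1)
        return g * feat_local + (1 - g) * feat_global
```

The sigmoid keeps the fused output a convex combination of the two streams, so a newly arriving client can start from the global features (gate near zero) and gradually shift weight toward its own representation.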
