Towards Faster and Better Federated Learning: A Feature Fusion Approach

Federated learning enables on-device training over distributed networks consisting of a massive number of modern smart devices, such as smartphones and IoT devices. However, the leading optimization algorithm in such settings, federated averaging, suffers from high communication costs and an inevitable drop in performance, especially when the local data are distributed in a non-IID way. In this paper, we propose a feature fusion method to address this problem. By aggregating the features from both the local and global models, we achieve higher accuracy at lower communication cost. Furthermore, the feature fusion modules offer a better initialization for newly incoming clients and thus speed up convergence. Experiments in popular federated learning scenarios show that our federated learning algorithm with the feature fusion mechanism outperforms baselines in both accuracy and generalization ability while reducing the number of communication rounds by more than 60%.
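The abstract does not spell out the exact fusion rule, so the following is only a minimal PyTorch sketch of one plausible form of such a module: a learned per-channel gate that interpolates between the feature maps of the client's local model and the downloaded global model. The class name FeatureFusion, the gating parameter, and the convex-combination rule are illustrative assumptions, not the paper's confirmed design.

    import torch
    import torch.nn as nn

    class FeatureFusion(nn.Module):
        """Fuse feature maps from a local and a global feature extractor.

        A raw weight per channel is squashed to (0, 1) with a sigmoid, so the
        output is a per-channel convex combination of the two feature maps.
        (Assumed design; the paper's actual fusion operator may differ.)
        """

        def __init__(self, num_channels: int):
            super().__init__()
            # One learnable fusion weight per feature channel.
            self.raw_weight = nn.Parameter(torch.zeros(num_channels))

        def forward(self, local_feat: torch.Tensor, global_feat: torch.Tensor) -> torch.Tensor:
            # Shape (1, C, 1, 1) so the gate broadcasts over batch and spatial dims.
            lam = torch.sigmoid(self.raw_weight).view(1, -1, 1, 1)
            return lam * local_feat + (1.0 - lam) * global_feat

In a federated round, local_feat would come from the client's current local backbone and global_feat from the feature extractor of the downloaded global model; the fused features then feed the client's classifier head, and only the usual model updates (plus the small gate) are communicated.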
