FedMD: Heterogenous Federated Learning via Model Distillation

Federated learning enables the creation of a powerful centralized model without compromising the data privacy of the participants. While successful, it does not address the case where each participant independently designs its own model. Owing to intellectual property concerns and the heterogeneous nature of tasks and data, this is a widespread requirement in applications of federated learning to areas such as health care and AI as a service. In this work, we use transfer learning and knowledge distillation to develop a universal framework that enables federated learning when each agent owns not only its private data but also a uniquely designed model. We test our framework on the MNIST/FEMNIST and CIFAR10/CIFAR100 dataset pairs and observe rapid improvement across all participating models. With 10 distinct participants, the final test accuracy of each model receives, on average, a 20% gain over what is possible without collaboration and is only a few percent below the performance each model would attain if all private datasets were pooled and made directly available to all participants.
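To make the distillation-based collaboration concrete, below is a minimal NumPy sketch of what one such communication round could look like, assuming the parties share class scores computed on a common public dataset, the server averages them into a consensus, and each party distills toward that consensus before retraining on its own private data. The `Participant` class, its linear-softmax model, and the hook names (`digest`, `revisit`) are illustrative stand-ins, not the authors' implementation.

```python
import numpy as np

class Participant:
    """Stand-in for a uniquely designed local model (here, a linear softmax)."""

    def __init__(self, n_features, n_classes, lr=0.1, rng=None):
        rng = rng or np.random.default_rng()
        self.W = 0.01 * rng.standard_normal((n_features, n_classes))
        self.lr = lr

    def predict_logits(self, X):
        # Class scores on a batch of inputs; these are what gets communicated.
        return X @ self.W

    def _step(self, X, target_probs):
        # One gradient step of cross-entropy against (possibly soft) targets.
        logits = self.predict_logits(X)
        p = np.exp(logits - logits.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        self.W -= self.lr * X.T @ (p - target_probs) / len(X)

    def digest(self, X_public, consensus_logits, steps=10):
        # Distillation: move the local model toward the averaged class scores.
        t = np.exp(consensus_logits - consensus_logits.max(axis=1, keepdims=True))
        t /= t.sum(axis=1, keepdims=True)
        for _ in range(steps):
            self._step(X_public, t)

    def revisit(self, X_private, y_private, n_classes, steps=10):
        # A few local epochs on the party's own private labeled data.
        onehot = np.eye(n_classes)[y_private]
        for _ in range(steps):
            self._step(X_private, onehot)


def fedmd_round(parties, X_public, private_sets, n_classes):
    """One communication round: communicate, aggregate, digest, revisit."""
    scores = [p.predict_logits(X_public) for p in parties]   # communicate
    consensus = np.mean(scores, axis=0)                      # aggregate
    for party, (Xp, yp) in zip(parties, private_sets):
        party.digest(X_public, consensus)                    # distill
        party.revisit(Xp, yp, n_classes)                     # local update
```

Note that only the communicated class scores depend on a shared interface; in the heterogeneous setting each party could plug in an arbitrary architecture behind `predict_logits` and the two update hooks.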
