FedHM: Efficient Federated Learning for Heterogeneous Models via Low-rank Factorization

The underlying assumption of recent federated learning (FL) paradigms is that local models share the same network architecture as the global model, an assumption that becomes impractical for mobile and IoT devices with heterogeneous hardware and infrastructure. A scalable federated learning framework should accommodate heterogeneous clients with differing computation and communication capabilities. To this end, this paper proposes FedHM, a novel federated model compression framework that distributes heterogeneous low-rank models to clients and then aggregates them into a global full-rank model. Our solution enables the training of heterogeneous local models with varying computational complexities while still producing a single global model. Furthermore, FedHM not only reduces on-device computational complexity but also lowers communication cost by exchanging low-rank models. Extensive experimental results demonstrate that FedHM outperforms existing pruning-based FL approaches in test Top-1 accuracy (4.6% gain on average), with smaller model sizes (1.5× smaller on average), under various heterogeneous FL settings.
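To make the factorize-then-aggregate idea concrete, below is a minimal sketch of the workflow the abstract describes, assuming the server compresses a full-rank weight matrix with truncated SVD before dispatching it and averages the reconstructed matrices on return. All function names and the rank choices are illustrative, not taken from the paper.

```python
import numpy as np

def factorize(W, rank):
    """Compress a full-rank weight matrix into low-rank factors via truncated SVD."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    # Fold the singular values into the left factor: W ≈ (U_r S_r) Vt_r
    return U[:, :rank] * S[:rank], Vt[:rank, :]

def reconstruct(U, Vt):
    """Recover a full-rank weight matrix from a client's low-rank factors."""
    return U @ Vt

def aggregate(client_factors, client_weights):
    """Average the reconstructed full-rank matrices (FedAvg-style) on the server."""
    total = sum(w * reconstruct(U, Vt)
                for (U, Vt), w in zip(client_factors, client_weights))
    return total / sum(client_weights)

# Example: two clients with different capacities receive rank-4 and rank-8
# factors of the same global layer; the server aggregates what they send back.
W_global = np.random.randn(64, 32)
factors = [factorize(W_global, r) for r in (4, 8)]
W_new = aggregate(factors, client_weights=[0.5, 0.5])
```

Note that a client holding rank-r factors trains and transmits only r(m + n) parameters per m×n layer, which is where both the computation and communication savings come from.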
