Personalized Federated Learning via Heterogeneous Modular Networks

Personalized Federated Learning (PFL), which collaboratively trains a federated model while accounting for individual clients under privacy constraints, has attracted much attention. Despite its popularity, it has been observed that existing PFL approaches yield sub-optimal solutions when the joint data distributions across local clients diverge. To address this issue, we present Federated Modular Network (FedMN), a novel PFL approach that adaptively selects sub-modules from a module pool to assemble heterogeneous neural architectures for different clients. FedMN adopts a lightweight routing hypernetwork to model the joint distribution on each client and to produce a personalized selection of module blocks for that client. To reduce the communication burden of existing FL, we develop an efficient client-server interaction scheme. We conduct extensive experiments on real-world test beds, and the results show both the effectiveness and the efficiency of the proposed FedMN over the baselines.
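To make the architecture concrete, below is a minimal sketch (not the authors' implementation) of the core idea: a shared pool of candidate modules per layer, plus a lightweight routing hypernetwork that maps a per-client embedding to a module selection for each layer. The Gumbel-Softmax relaxation for differentiable discrete selection, and all names (`ModularLayer`, `Router`, `FedMNSketch`, `client_embed`), are illustrative assumptions; the paper's exact routing mechanism and client representation may differ.

```python
# Hypothetical sketch of a modular network with per-client routing.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModularLayer(nn.Module):
    """One layer backed by a pool of candidate modules (here: linear blocks)."""
    def __init__(self, num_modules: int, dim: int):
        super().__init__()
        self.pool = nn.ModuleList(nn.Linear(dim, dim) for _ in range(num_modules))

    def forward(self, x, weights):
        # weights: (num_modules,) selection/mixing weights for this client.
        outs = torch.stack([m(x) for m in self.pool], dim=0)  # (M, B, D)
        return torch.einsum("m,mbd->bd", weights, outs)

class Router(nn.Module):
    """Lightweight hypernetwork: client embedding -> module-selection weights."""
    def __init__(self, embed_dim: int, num_layers: int, num_modules: int):
        super().__init__()
        self.num_layers, self.num_modules = num_layers, num_modules
        self.net = nn.Linear(embed_dim, num_layers * num_modules)

    def forward(self, client_embed, tau: float = 1.0, hard: bool = False):
        logits = self.net(client_embed).view(self.num_layers, self.num_modules)
        # Differentiable (approximately) discrete choice via Gumbel-Softmax;
        # hard=True yields a one-hot pick per layer with a straight-through gradient.
        return F.gumbel_softmax(logits, tau=tau, hard=hard, dim=-1)

class FedMNSketch(nn.Module):
    def __init__(self, dim=32, num_layers=3, num_modules=4, embed_dim=8, num_classes=10):
        super().__init__()
        self.layers = nn.ModuleList(
            ModularLayer(num_modules, dim) for _ in range(num_layers))
        self.router = Router(embed_dim, num_layers, num_modules)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, x, client_embed, hard: bool = False):
        weights = self.router(client_embed, hard=hard)  # (L, M), one row per layer
        for layer, w in zip(self.layers, weights):
            x = F.relu(layer(x, w))
        return self.head(x)

# Usage: each client keeps its own embedding; with hard=True the router selects
# one module per layer, so only the chosen blocks would need to be exchanged
# with the server, hinting at the communication savings the abstract mentions.
model = FedMNSketch()
x = torch.randn(16, 32)          # a batch of client data
client_embed = torch.randn(8)    # this client's learned embedding (assumed)
logits = model(x, client_embed, hard=True)
```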
