Personalized Cross-Silo Federated Learning on Non-IID Data

Non-IID data pose a significant challenge for federated learning. In this paper, we explore the novel idea of facilitating pairwise collaboration between clients with similar data. We propose FedAMP, a new method that employs federated attentive message passing to encourage stronger collaboration among similar clients. We establish the convergence of FedAMP for both convex and non-convex models, and propose a heuristic method that further improves its performance when clients adopt deep neural networks as personalized models. Extensive experiments on benchmark datasets demonstrate the superior performance of the proposed methods.
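
The core mechanism can be illustrated with a short sketch: the server forms, for each client, a personalized aggregate whose attention weights decay with the distance between client models, and each client then optimizes its local objective with a proximal term pulling it toward that aggregate. The softmax similarity kernel, the toy quadratic objectives, and the hyperparameters below are illustrative assumptions rather than the paper's exact formulation.

```python
"""Minimal sketch of attention-based personalized aggregation in the spirit of
FedAMP. The softmax kernel, step sizes, and synthetic clients are assumptions
made for illustration, not the paper's exact algorithm."""
import numpy as np

rng = np.random.default_rng(0)


def personalized_cloud_models(client_models, temperature=1.0):
    """For each client i, build a personalized aggregate u_i as an
    attention-weighted average of all client models, where the weight
    grows as the parameter distance between clients shrinks."""
    W = np.stack(client_models)                              # shape (m, d)
    d2 = ((W[:, None, :] - W[None, :, :]) ** 2).sum(-1)      # pairwise squared distances
    logits = -d2 / temperature
    xi = np.exp(logits - logits.max(axis=1, keepdims=True))
    xi /= xi.sum(axis=1, keepdims=True)                      # each row sums to 1
    return xi @ W                                            # u_i = sum_j xi_ij * w_j


def local_update(w, u, grad_fn, lr=0.1, lam=1.0, steps=5):
    """Client-side step: a few gradient updates on the local loss plus a
    proximal term pulling the personalized model toward its cloud aggregate."""
    for _ in range(steps):
        w = w - lr * (grad_fn(w) + lam * (w - u))
    return w


# Toy non-IID setup: two clusters of clients whose local optima differ.
targets = [np.array([1.0, 1.0]), np.array([1.0, 1.0]),
           np.array([-1.0, -1.0]), np.array([-1.0, -1.0])]
grads = [lambda w, t=t: w - t for t in targets]              # gradient of 0.5*||w - t||^2
models = [rng.normal(size=2) for _ in targets]

for _ in range(20):                                          # communication rounds
    U = personalized_cloud_models(models, temperature=0.5)
    models = [local_update(w, u, g) for w, u, g in zip(models, U, grads)]

print(np.round(np.stack(models), 2))  # similar clients converge near their shared optimum
```

In this sketch, clients whose models stay close assign each other larger attention weights, so their personalized aggregates reinforce one another, while dissimilar clients interact only weakly.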
