Fed-ensemble: Improving Generalization through Model Ensembling in Federated Learning

In this paper we propose Fed-ensemble: a simple approach that brings model ensembling to federated learning (FL). Instead of aggregating local models to update a single global model, Fed-ensemble uses random permutations to update a group of K models and then obtains predictions through model averaging. Fed-ensemble can be readily utilized within established FL methods and does not impose additional computational or communication overhead, since only one of the K models needs to be sent to a client in each communication round. Theoretically, we show that predictions on new data from all K models belong to the same predictive posterior distribution under a neural tangent kernel regime; this result in turn sheds light on the generalization advantages of model averaging. We also show that Fed-ensemble admits an elegant Bayesian interpretation. Empirical results show that our approach outperforms several FL algorithms on a wide range of datasets and excels in the heterogeneous settings often encountered in FL applications.
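To make the training procedure concrete, the following is a minimal PyTorch-style sketch of one Fed-ensemble communication round and of ensemble prediction, not the authors' implementation. The helper `local_update(model, client)`, the round-robin assignment after a random permutation, and the FedAvg-style aggregation within each model are assumptions made for illustration; the paper's exact stratification scheme may differ.

```python
import copy
import random

import torch


def fed_ensemble_round(models, clients, local_update):
    """One communication round of Fed-ensemble (illustrative sketch).

    models:       list of K server-side models (torch.nn.Module).
    clients:      list of client handles / datasets.
    local_update: user-supplied routine; local_update(model, client)
                  trains `model` on the client's data and returns the
                  updated state_dict.

    Each client receives exactly one of the K models, so per-client
    communication cost matches single-model FL.
    """
    K = len(models)

    # Randomly permute the participating clients, then assign them to the
    # K models round-robin.
    permuted = random.sample(clients, len(clients))
    assignments = [permuted[k::K] for k in range(K)]

    # Update every model FedAvg-style from the clients assigned to it.
    for model, assigned in zip(models, assignments):
        local_states = [local_update(copy.deepcopy(model), c) for c in assigned]
        if not local_states:
            continue  # no clients drawn for this model in this round
        averaged = {
            name: torch.stack([s[name].float() for s in local_states]).mean(dim=0)
            for name in local_states[0]
        }
        model.load_state_dict(averaged)
    return models


def ensemble_predict(models, x):
    """Prediction on new data: average the K models' outputs."""
    with torch.no_grad():
        return torch.stack([m(x) for m in models]).mean(dim=0)
```

Under uniform random assignment, every one of the K models sees, in expectation, the same client distribution over rounds, which is what allows the ensemble members to be treated as draws from a common predictive distribution, as sketched below.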

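The theoretical claim can be made concrete with the standard neural-tangent-kernel linearization; the following is a sketch of the well-known wide-network result (in the spirit of Jacot et al. and Lee et al.), not the paper's exact derivation. Here (X, Y) denotes the training data, Θ the NTK, and f_0 the network at its random initialization.

```latex
% Converged prediction of a wide (linearized) network trained on (X, Y)
% with squared loss:
f_{\infty}(x) \;=\; f_{0}(x) \;+\; \Theta(x, X)\,\Theta(X, X)^{-1}\bigl(Y - f_{0}(X)\bigr).

% Over random initializations, f_0 is approximately a zero-mean Gaussian
% process, so f_\infty(x) is Gaussian with mean \Theta(x,X)\Theta(X,X)^{-1}Y.
% The K ensemble members are i.i.d. draws from this same distribution, and
% averaging them keeps the mean while shrinking the initialization-dependent
% variance by a factor of K:
\operatorname{Var}\!\Bigl[\tfrac{1}{K}\sum_{k=1}^{K} f_{\infty}^{(k)}(x)\Bigr]
  \;=\; \tfrac{1}{K}\,\operatorname{Var}\bigl[f_{\infty}^{(1)}(x)\bigr].
```

In this view each of the K models is a sample from the same predictive posterior, and model averaging approximates the posterior mean, which is the intuition behind the stated generalization advantage.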