FLoRA: Single-shot Hyper-parameter Optimization for Federated Learning

We address the relatively unexplored problem of hyper-parameter optimization (HPO) for federated learning (FL-HPO). We introduce Federated Loss SuRface Aggregation (FLoRA), the first FL-HPO solution framework that can handle tabular data and gradient boosting training algorithms in addition to the stochastic gradient descent/neural network setting commonly addressed in the FL literature. The framework enables single-shot FL-HPO: it first identifies a good set of hyper-parameters, which are then used in a single FL training run. Thus, it enables FL-HPO solutions with minimal additional communication overhead compared to FL training without HPO. Our empirical evaluation of FLoRA for gradient boosted decision trees on seven OpenML data sets demonstrates significant model accuracy improvements over the considered baseline, and robustness to an increasing number of parties involved in FL-HPO training.
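
The sketch below illustrates the single-shot idea described above under stated assumptions: each party evaluates a shared pool of candidate hyper-parameter configurations on its own data, the aggregator fuses the resulting per-party loss surfaces, and the configuration minimizing the aggregated surface is chosen for the one federated training run. All function names, the RandomForestRegressor surrogate, and the mean aggregation rule are illustrative assumptions, not the paper's exact method.

```python
# Hypothetical sketch of single-shot FL-HPO via loss-surface aggregation.
# Assumptions: a shared candidate pool, per-party GBDT evaluation, a
# random-forest surrogate per party, and mean aggregation of surfaces.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import log_loss


def sample_candidates(rng, n=30):
    """Draw a shared pool of GBDT hyper-parameter configurations."""
    return [
        {
            "n_estimators": int(rng.integers(20, 200)),
            "learning_rate": float(10 ** rng.uniform(-2, 0)),
            "max_depth": int(rng.integers(2, 6)),
        }
        for _ in range(n)
    ]


def local_loss_surface(X, y, candidates, seed=0):
    """One party: evaluate every candidate on a local train/validation split."""
    X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.3, random_state=seed)
    losses = []
    for cfg in candidates:
        model = GradientBoostingClassifier(random_state=seed, **cfg).fit(X_tr, y_tr)
        losses.append(log_loss(y_va, model.predict_proba(X_va)))
    return np.asarray(losses)


def pick_single_shot_config(candidates, per_party_losses):
    """Aggregator: fit a surrogate per party, then minimise the averaged surface."""
    X_cfg = np.array(
        [[c["n_estimators"], c["learning_rate"], c["max_depth"]] for c in candidates]
    )
    predicted_surfaces = []
    for losses in per_party_losses:
        surrogate = RandomForestRegressor(n_estimators=100, random_state=0)
        surrogate.fit(X_cfg, losses)
        predicted_surfaces.append(surrogate.predict(X_cfg))
    aggregated = np.mean(predicted_surfaces, axis=0)
    return candidates[int(np.argmin(aggregated))]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    candidates = sample_candidates(rng)
    # Toy "parties": disjoint slices of a synthetic binary classification task.
    X = rng.normal(size=(600, 10))
    y = (X[:, 0] + X[:, 1] > 0).astype(int)
    party_indices = np.array_split(np.arange(len(y)), 3)
    per_party_losses = [
        local_loss_surface(X[idx], y[idx], candidates, seed=i)
        for i, idx in enumerate(party_indices)
    ]
    print("single-shot config:", pick_single_shot_config(candidates, per_party_losses))
```

In this sketch each party communicates only its (configuration, loss) pairs once, which is what keeps the additional communication overhead minimal relative to the subsequent federated training run.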
