On Noisy Evaluation in Federated Hyperparameter Tuning

Hyperparameter tuning is critical to the success of federated learning applications. Unfortunately, appropriately selecting hyperparameters is challenging in federated networks. Issues of scale, privacy, and heterogeneity introduce noise in the tuning process and make it difficult to evaluate the performance of various hyperparameters. In this work, we perform the first systematic study on the effect of noisy evaluation in federated hyperparameter tuning. We first identify and rigorously explore key sources of noise, including client subsampling, data and systems heterogeneity, and data privacy. Surprisingly, our results indicate that even small amounts of noise can significantly impact tuning methods, reducing the performance of state-of-the-art approaches to that of naive baselines. To address noisy evaluation in such scenarios, we propose a simple and effective approach that leverages public proxy data to boost the evaluation signal. Our work establishes general challenges, baselines, and best practices for future work in federated hyperparameter tuning.
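
To make the tuning difficulty concrete, the sketch below illustrates how client subsampling injects noise into hyperparameter evaluation, and how a public proxy dataset can supply a steadier signal. This is a minimal sketch under stated assumptions: the names (model_score_fn, client_datasets, proxy_dataset, proxy_weight) are hypothetical placeholders, and the simple weighted blend is an illustrative choice rather than the paper's exact method.

```python
import numpy as np


def evaluate_on_clients(model_score_fn, client_datasets, subsample_frac=0.1, rng=None):
    """Noisy federated evaluation: score a hyperparameter configuration on a
    random subsample of clients. Different draws can rank configurations
    differently, which is the kind of evaluation noise studied here."""
    rng = rng if rng is not None else np.random.default_rng()
    n = max(1, int(subsample_frac * len(client_datasets)))
    chosen = rng.choice(len(client_datasets), size=n, replace=False)
    return float(np.mean([model_score_fn(client_datasets[i]) for i in chosen]))


def evaluate_with_proxy(model_score_fn, client_datasets, proxy_dataset,
                        subsample_frac=0.1, proxy_weight=0.5, rng=None):
    """Blend the noisy federated estimate with a score on public proxy data
    to strengthen the evaluation signal (the weighting scheme is an
    assumption made for illustration)."""
    fed_score = evaluate_on_clients(model_score_fn, client_datasets,
                                    subsample_frac, rng)
    proxy_score = model_score_fn(proxy_dataset)
    return proxy_weight * proxy_score + (1.0 - proxy_weight) * fed_score
```

In this sketch, a tuning loop would call evaluate_with_proxy once per candidate configuration and keep the highest-scoring one; lowering the variance of that score is what allows sophisticated tuners to stay ahead of naive baselines such as random search.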
