Probabilistic Predictions with Federated Learning

Probabilistic predictions from machine learning models are important in many applications and are commonly obtained with Bayesian learning algorithms. However, Bayesian learning methods are computationally expensive compared with non-Bayesian methods. Furthermore, the data used to train these algorithms are often distributed over a large group of end devices. Federated learning can be applied in this setting in a communication-efficient and privacy-preserving manner, but it does not provide predictive uncertainty. To represent predictive uncertainty in federated learning, we propose introducing uncertainty in the aggregation step of the algorithm by treating the set of local weights as a posterior distribution over the weights of the global model. We compare our approach to state-of-the-art Bayesian and non-Bayesian probabilistic learning algorithms. By applying proper scoring rules to evaluate the predictive distributions, we show that our approach can achieve performance similar to what the benchmarks achieve in a non-distributed setting.
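The aggregation idea can be illustrated with a minimal sketch. The abstract does not specify how the posterior over global weights is represented, so several choices here are assumptions made purely for illustration: an independent Gaussian is fitted to each weight across clients, the model is a hypothetical linear predictor, and the predictive distribution is scored with the ensemble form of the continuous ranked probability score (CRPS), one common proper scoring rule. All function names are hypothetical.

```python
import numpy as np


def aggregate_as_posterior(local_weights):
    """Summarize client weights by a per-weight mean and standard deviation.

    Assumption: an independent Gaussian per weight; the source does not
    state which distribution is used.
    """
    w = np.asarray(local_weights, dtype=float)  # shape (n_clients, n_weights)
    return w.mean(axis=0), w.std(axis=0, ddof=1)


def sample_global_weights(mean, std, n_samples, rng):
    """Draw global weight vectors from the fitted Gaussian."""
    return rng.normal(loc=mean, scale=std, size=(n_samples, mean.shape[0]))


def predict_ensemble(weight_samples, x):
    """Hypothetical linear model y = w . x, evaluated once per weight sample."""
    return weight_samples @ x


def crps_ensemble(samples, y):
    """Ensemble CRPS estimate: E|X - y| - 0.5 * E|X - X'| (lower is better)."""
    s = np.asarray(samples, dtype=float)
    spread = np.abs(s[:, None] - s[None, :]).mean()
    return np.abs(s - y).mean() - 0.5 * spread


# Usage: three clients, two weights each (made-up numbers).
rng = np.random.default_rng(0)
local_weights = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2]]
mean, std = aggregate_as_posterior(local_weights)
weight_samples = sample_global_weights(mean, std, 1000, rng)
predictions = predict_ensemble(weight_samples, np.array([1.0, 1.0]))
score = crps_ensemble(predictions, y=3.0)
```

Instead of a single averaged weight vector, the server in this sketch keeps a distribution whose spread reflects disagreement between clients, and predictions are obtained by pushing weight samples through the model. Whether this matches the method evaluated in the paper cannot be confirmed from the abstract alone.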
