Differentially-Private Federated Linear Bandits

The rapid proliferation of decentralized learning systems calls for differentially private cooperative learning. In this paper, we study this problem in the context of the contextual linear bandit: we consider a collection of agents cooperating to solve a common contextual bandit while ensuring that their communication remains private. For this problem, we devise \textsc{FedUCB}, a multi-agent private algorithm for both centralized and decentralized (peer-to-peer) federated learning. We provide a rigorous technical analysis of its utility in terms of regret, improving several results in cooperative bandit learning, and provide rigorous privacy guarantees as well. Our algorithms are competitive both in their pseudoregret bounds and in empirical benchmark performance across various multi-agent settings.
