Homomorphically Encrypted Linear Contextual Bandit

The contextual bandit is a general framework for online learning in sequential decision-making problems, with applications in a wide range of domains, including recommendation systems, online advertising, clinical trials and many more. A critical aspect of bandit methods is that they must observe the contexts (i.e., individual or group-level data) and the rewards in order to solve the sequential problem. Their wide deployment in industrial applications has increased interest in methods that preserve the privacy of the users. In this paper, we introduce a privacy-preserving bandit framework based on asymmetric encryption. The bandit algorithm only observes encrypted information (contexts and rewards) and has no ability to decrypt it. Leveraging homomorphic encryption, we show that despite the complexity of the setting, it is possible to learn over encrypted data. We introduce an algorithm that achieves a Õ(d√T) regret bound in any linear contextual bandit problem, while keeping data encrypted.
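To give a rough sense of what computing on encrypted contexts means, the sketch below uses a toy Paillier cryptosystem to evaluate a linear score on an encrypted context vector without decrypting it. This is an illustration only and not the scheme or algorithm of the paper: the abstract does not name a specific scheme, Paillier supports only additions of ciphertexts and scaling by plaintext integers, the weights here are kept in plaintext for simplicity, and the primes, function names and example values are assumptions chosen for the demo.

```python
import math
import random

# Toy parameters (assumption): two small, well-known primes. Real deployments
# use primes of 1024 bits or more; these are far too small to be secure.
P, Q = 65537, 2147483647


def keygen(p=P, q=Q):
    """Return (public modulus n, secret key (lam, mu)) for toy Paillier."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid because the generator is fixed to g = n + 1
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt a non-negative integer m < n under the public modulus n."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n, sk, c):
    """Decrypt ciphertext c with the secret key."""
    lam, mu = sk
    return ((pow(c, lam, n * n) - 1) // n * mu) % n


def add(n, c1, c2):
    """Homomorphic addition: the product of ciphertexts encrypts the sum."""
    return (c1 * c2) % (n * n)


def scale(n, c, k):
    """Homomorphic scaling by a plaintext integer k >= 0."""
    return pow(c, k, n * n)


def encrypted_dot(n, enc_x, w):
    """Compute Enc(<x, w>) from encrypted features enc_x and plaintext weights w."""
    acc = encrypt(n, 0)
    for c, k in zip(enc_x, w):
        acc = add(n, acc, scale(n, c, k))
    return acc


# The user holds the keys; the learner only ever sees ciphertexts.
n, sk = keygen()
context = [3, 1, 4]   # hypothetical integer-encoded context
weights = [2, 7, 1]   # hypothetical plaintext weights held by the learner
enc_context = [encrypt(n, x) for x in context]
enc_score = encrypted_dot(n, enc_context, weights)
print(decrypt(n, sk, enc_score))  # 17
```

The learner in this sketch never holds the secret key, so it cannot read the context or the score. A setting where the model estimate is also encrypted would need a scheme that supports multiplying two ciphertexts, which this toy example does not provide.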
