Optimal query complexity for private sequential learning

Motivated by privacy concerns in many practical applications such as Federated Learning, we study a stylized private sequential learning problem: a learner tries to estimate an unknown scalar value by sequentially querying an external database and receiving binary responses; meanwhile, a third-party adversary observes the learner's queries but not the responses. The learner's goal is to design a querying strategy that uses the minimum number of queries (the optimal query complexity) so that she can accurately estimate the true value while the adversary, even with complete knowledge of her querying strategy, cannot. Prior work has obtained both upper and lower bounds on the optimal query complexity; however, these bounds have a large gap in general. In this paper, we construct new querying strategies and prove almost matching upper and lower bounds, providing a complete characterization of the optimal query complexity as a function of the estimation accuracy and the desired levels of privacy.
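To make the setting concrete, the sketch below shows a plain bisection learner, which is the natural non-private baseline and not one of the strategies constructed in the paper. It rests on assumptions that the abstract does not state: the unknown value lies in [0, 1), a query is a point q, and the database answers whether the true value is at least q. The names `make_database`, `bisection_learner`, and `adversary_guess` are hypothetical.

```python
def make_database(true_value):
    """Hypothetical database: answers True iff the true value is >= the query."""
    return lambda q: true_value >= q


def bisection_learner(respond, epsilon):
    """Shrink an interval containing the true value until its length is <= epsilon.

    Assumes the true value lies in [0, 1). Returns the midpoint of the final
    interval together with the list of queries issued, which is exactly what
    an adversary observing the queries would see.
    """
    lo, hi = 0.0, 1.0
    queries = []
    while hi - lo > epsilon:
        q = (lo + hi) / 2
        queries.append(q)
        if respond(q):   # true value is in [q, hi)
            lo = q
        else:            # true value is in [lo, q)
            hi = q
    return (lo + hi) / 2, queries


def adversary_guess(queries):
    """Adversary sees only the queries; the last one is already close to the truth."""
    return queries[-1]


true_value = 0.3
estimate, queries = bisection_learner(make_database(true_value), epsilon=1 / 16)
guess = adversary_guess(queries)
```

Under these assumptions the learner needs about log2(1/epsilon) queries, but the query sequence itself converges on the true value, so an observer who never sees a response can still locate it to within roughly epsilon. A private strategy has to spend additional queries to obscure this, which is the cost that the query complexity bounds quantify.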
