Differentially Private Assouad, Fano, and Le Cam

Le Cam's method, Fano's inequality, and Assouad's lemma are three widely used techniques for proving lower bounds in statistical estimation. We propose their analogues under central differential privacy. Our results are simple and easy to apply, and we use them to establish sample complexity bounds for several estimation tasks. We establish the optimal sample complexity of discrete distribution estimation under total variation distance and $\ell_2$ distance. We also provide lower bounds, tight up to logarithmic factors, for several other distribution classes, including product distributions and Gaussian mixtures. The technical core of our paper relates couplings between distributions to the sample complexity of estimation under differential privacy.
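
For orientation, the classical (non-private) two-point form of Le Cam's method and the standard definition of $(\varepsilon, \delta)$-differential privacy can be written as below. This is a background sketch only: the symbols $P_1$, $P_2$, $\psi$, $\mathcal{A}$, $X$, $X'$, and $S$ are notation assumed here for illustration, and the private analogues proposed in the paper are not reproduced.

```latex
% Classical Le Cam two-point bound: for any two distributions P_1, P_2 and
% any test \psi that sees n i.i.d. samples X^n, the worst-case error obeys
\[
  \inf_{\psi} \max_{i \in \{1,2\}} P_i^{n}\bigl(\psi(X^n) \neq i\bigr)
  \;\ge\; \frac{1}{2}\Bigl(1 - d_{\mathrm{TV}}\bigl(P_1^{n}, P_2^{n}\bigr)\Bigr).
\]

% Central (\varepsilon, \delta)-differential privacy: a randomized algorithm
% \mathcal{A} is private if, for all datasets X, X' differing in one entry
% and all measurable output sets S,
\[
  \Pr\bigl[\mathcal{A}(X) \in S\bigr]
  \;\le\; e^{\varepsilon} \Pr\bigl[\mathcal{A}(X') \in S\bigr] + \delta .
\]
```

The first inequality follows because the smallest achievable sum of the two error probabilities equals $1 - d_{\mathrm{TV}}(P_1^{n}, P_2^{n})$, and the maximum of two numbers is at least their average.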
