Paper information - Minimax Rates of Estimating Approximate Differential Privacy

Differential privacy has become a widely accepted notion of privacy, leading to the introduction and deployment of numerous privatization mechanisms. However, ensuring the privacy guarantee is an error-prone process, both in designing mechanisms and in implementing them. Both types of errors would be greatly reduced if we had a data-driven approach to verifying privacy guarantees from black-box access to a mechanism. We pose this as a property estimation problem and study the fundamental trade-off between the accuracy of the estimated privacy guarantee and the number of samples required. We introduce a novel estimator that uses polynomial approximation of a carefully chosen degree to optimally trade off bias and variance. With $n$ samples, we show that this estimator achieves the performance of a straightforward plug-in estimator with $n \ln n$ samples, a phenomenon referred to as effective sample size amplification. The minimax optimality of the proposed estimator is proved by comparing it to a matching fundamental lower bound.
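As a rough illustration of the "straightforward plug-in estimator" that the abstract uses as a baseline, the sketch below assumes the standard characterization of approximate differential privacy: a mechanism is $(\varepsilon, \delta)$-differentially private when, for output distributions $P$ and $Q$ on neighboring inputs, $\sum_x \max(0, P(x) - e^{\varepsilon} Q(x)) \le \delta$. The plug-in approach replaces $P$ and $Q$ with empirical frequencies from black-box samples. The function name `plugin_delta` and its arguments are hypothetical and do not come from the paper, and this is not the polynomial-approximation estimator the paper proposes.

```python
import math
from collections import Counter


def plugin_delta(samples_p, samples_q, epsilon):
    """Plug-in estimate of sum_x max(0, P(x) - e^epsilon * Q(x)).

    samples_p, samples_q: non-empty sequences of hashable outputs drawn from
    the mechanism on two neighboring inputs (an assumption of this sketch).
    """
    n_p, n_q = len(samples_p), len(samples_q)
    counts_p = Counter(samples_p)
    counts_q = Counter(samples_q)
    scale = math.exp(epsilon)

    total = 0.0
    # Only symbols observed under P can contribute a positive term.
    for symbol, c_p in counts_p.items():
        p_hat = c_p / n_p
        q_hat = counts_q.get(symbol, 0) / n_q
        total += max(0.0, p_hat - scale * q_hat)
    return total
```

For a quick check, calling `plugin_delta(["a", "a", "a", "b"], ["a", "b", "b", "b"], 0.0)` compares empirical frequencies 0.75 and 0.25 on the symbol `"a"` and gives 0.5. The estimate covers one direction only; a full check would presumably also swap the roles of the two sample sets and range over neighboring input pairs.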

Xiyang Liu | Sewoong Oh
