Differentially Private Uniformly Most Powerful Tests for Binomial Data

We derive uniformly most powerful (UMP) tests for simple and one-sided hypotheses for a population proportion within the framework of Differential Privacy (DP), optimizing finite-sample performance. We show that, in general, DP hypothesis tests for exchangeable data can always be expressed as a function of the empirical distribution. Using this structure, we prove a "Neyman-Pearson lemma" for binomial data under DP, where the DP-UMP test depends only on the sample sum. Our tests can also be stated as a post-processing of a random variable whose distribution we coin "Truncated-Uniform-Laplace" (Tulap), a generalization of the Staircase and discrete Laplace distributions. Furthermore, we obtain exact p-values, which are easily computed in terms of the Tulap random variable. We show that our results also apply to distribution-free hypothesis tests for continuous data. Our simulation results demonstrate that our tests have exact type I error and are more powerful than current techniques.
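To make the mechanism concrete, the following is a minimal sketch (not the authors' reference implementation) of the pure-epsilon (delta = 0) case: the sample sum of binary data is released after adding Tulap(0, b, 0) noise with b = exp(-epsilon), which here is drawn as a discrete Laplace variate (difference of two geometrics) plus Uniform(-1/2, 1/2) noise, and the one-sided p-value P(X + N >= Z), X ~ Binomial(n, theta0), is approximated by Monte Carlo rather than the closed-form Tulap CDF. The names tulap_noise, dp_statistic, and dp_p_value are illustrative, not from the paper.

```python
# Hedged sketch of a Tulap-based DP test for a binomial proportion (delta = 0 case).
import numpy as np

rng = np.random.default_rng(0)

def tulap_noise(epsilon, size=1, rng=rng):
    """Draw Tulap(0, b=exp(-epsilon), 0) noise: discrete Laplace + Uniform(-1/2, 1/2)."""
    b = np.exp(-epsilon)
    g1 = rng.geometric(1.0 - b, size) - 1   # geometric on {0, 1, 2, ...}
    g2 = rng.geometric(1.0 - b, size) - 1
    u = rng.uniform(-0.5, 0.5, size)
    return g1 - g2 + u

def dp_statistic(data, epsilon, rng=rng):
    """Release the sample sum of 0/1 data with Tulap noise (epsilon-DP)."""
    return data.sum() + tulap_noise(epsilon, size=1, rng=rng)[0]

def dp_p_value(z, n, theta0, epsilon, n_mc=200_000, rng=rng):
    """Monte Carlo approximation of the p-value P(X + N >= z) for
    H0: theta <= theta0 vs H1: theta > theta0, X ~ Binomial(n, theta0), N ~ Tulap."""
    x = rng.binomial(n, theta0, n_mc)
    noise = tulap_noise(epsilon, size=n_mc, rng=rng)
    return np.mean(x + noise >= z)

# Toy usage: n = 100 coin flips with true theta = 0.7, testing theta0 = 0.5 at epsilon = 1.
data = rng.binomial(1, 0.7, size=100)
z = dp_statistic(data, epsilon=1.0)
print(f"private statistic: {z:.2f}, p-value: {dp_p_value(z, 100, 0.5, 1.0):.4f}")
```

Because the noisy statistic is a single real number, the p-value is an ordinary post-processing of it and incurs no additional privacy cost; the exact (non-Monte Carlo) version would replace the simulation with the Tulap distribution function.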
