Canonical Noise Distributions and Private Hypothesis Tests

$f$-DP has recently been proposed as a generalization of differential privacy allowing a lossless analysis of composition, post-processing, and privacy amplification via subsampling. In the setting of $f$-DP, we propose the concept of a canonical noise distribution (CND), the first mechanism designed for an arbitrary $f$-DP guarantee. The notion of CND captures whether an additive privacy mechanism perfectly matches the privacy guarantee of a given $f$. We prove that a CND always exists, and give a construction that produces a CND for any $f$. We show that private hypothesis tests are intimately related to CNDs, allowing for the release of private $p$-values at no additional privacy cost as well as the construction of uniformly most powerful (UMP) tests for binary data, within the general $f$-DP framework. We apply our techniques to the problem of difference of proportions testing, and construct a UMP unbiased (UMPU) "semi-private" test which upper bounds the performance of any $f$-DP test. Using this as a benchmark, we propose a private test, based on the inversion of characteristic functions, which allows for optimal inference for the two population parameters and is nearly as powerful as the semi-private UMPU test. When specialized to the case of $(\epsilon,0)$-DP, we show empirically that our proposed test is more powerful than any $(\epsilon/\sqrt 2)$-DP test and has more accurate type I errors than the classic normal approximation test.
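To make "perfectly matches the privacy guarantee of a given $f$" concrete, the following is a minimal sketch, not the paper's construction. It writes down two trade-off functions that are standard in the $f$-DP literature, $f_{\epsilon,\delta}$ and the Gaussian trade-off function $G_\mu$, and checks numerically that additive Gaussian noise with standard deviation $1/\mu$ on a statistic of sensitivity 1 has a trade-off curve equal to $G_\mu$. The function names and the unit-sensitivity setting are assumptions made for illustration.

```python
from math import exp
from statistics import NormalDist

_STD = NormalDist()  # standard normal, used for Phi and its inverse


def tradeoff_eps_delta(alpha, eps, delta=0.0):
    """f_{eps,delta}(alpha): smallest type II error allowed under (eps, delta)-DP."""
    return max(0.0,
               1.0 - delta - exp(eps) * alpha,
               exp(-eps) * (1.0 - delta - alpha))


def tradeoff_gdp(alpha, mu):
    """G_mu(alpha) = Phi(Phi^{-1}(1 - alpha) - mu), for 0 < alpha < 1."""
    return _STD.cdf(_STD.inv_cdf(1.0 - alpha) - mu)


def gaussian_shift_tradeoff(alpha, sigma):
    """Type II error of the most powerful level-alpha test of
    N(0, sigma^2) against N(1, sigma^2), for 0 < alpha < 1.

    The shift of 1 is the assumed sensitivity of the statistic."""
    null = NormalDist(0.0, sigma)
    alt = NormalDist(1.0, sigma)
    threshold = null.inv_cdf(1.0 - alpha)  # reject when the observation exceeds this
    return alt.cdf(threshold)


if __name__ == "__main__":
    mu = 0.7
    for alpha in (0.05, 0.3, 0.9):
        achieved = gaussian_shift_tradeoff(alpha, sigma=1.0 / mu)
        target = tradeoff_gdp(alpha, mu)
        print(f"alpha={alpha:.2f}  achieved={achieved:.6f}  G_mu={target:.6f}")
```

In this sketch the achieved curve and the target curve agree at every level $\alpha$, which is the sense in which the noise neither wastes nor exceeds the privacy budget expressed by $G_\mu$.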
