Differentially Private Identity and Equivalence Testing of Discrete Distributions

We study the fundamental problems of identity and equivalence testing over a discrete population from random samples. Our goal is to develop efficient testers while guaranteeing differential privacy to the individuals of the population. We provide sample-efficient differentially private testers for these problems. Our theoretical results significantly improve over the best known algorithms for identity testing, and are the first results for private equivalence testing. The conceptual message of our work is that there exist private hypothesis testers that are nearly as sample-efficient as their non-private counterparts. We perform an experimental evaluation of our algorithms on synthetic data. Our experiments illustrate that our private testers achieve small type I and type II errors with sample size sublinear in the domain size of the underlying distributions.
