Testing Properties of Multiple Distributions with Few Samples

We propose a new setting for testing properties of distributions in which samples are drawn from several distributions, but only a few samples are available per distribution. Given samples from $s$ distributions $p_1, p_2, \ldots, p_s$, we design testers for the following problems: (1) Uniformity Testing: testing whether all the $p_i$'s are uniform or $\epsilon$-far from being uniform in $\ell_1$-distance; (2) Identity Testing: testing whether all the $p_i$'s are equal to an explicitly given distribution $q$ or $\epsilon$-far from $q$ in $\ell_1$-distance; and (3) Closeness Testing: testing whether all the $p_i$'s are equal to a distribution $q$ to which we have sample access, or $\epsilon$-far from $q$ in $\ell_1$-distance. Under an additional natural condition on the source distributions, we provide sample-optimal testers for all of these problems.
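To make the notion of $\epsilon$-far concrete, the following minimal Python sketch computes the $\ell_1$-distance between two explicitly given distributions over the same finite domain and checks it against a threshold. It only illustrates the distance used in the problem statements and is not one of the testers, which see samples rather than the probability vectors. The helper names `l1_distance`, `uniform`, and `is_eps_far` are hypothetical, and the sketch treats a single distribution; how distances are aggregated over $p_1, \ldots, p_s$ is not specified here.

```python
from typing import List, Sequence


def l1_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Return the l1-distance sum_x |p(x) - q(x)| between two probability vectors."""
    if len(p) != len(q):
        raise ValueError("p and q must be defined over the same domain")
    return sum(abs(a - b) for a, b in zip(p, q))


def uniform(n: int) -> List[float]:
    """Return the uniform distribution over a domain of size n."""
    return [1.0 / n] * n


def is_eps_far(p: Sequence[float], q: Sequence[float], eps: float) -> bool:
    """Check whether p is at l1-distance at least eps from q (illustrative helper)."""
    return l1_distance(p, q) >= eps


if __name__ == "__main__":
    # A distribution supported on half of a domain of size 4.
    p = [0.5, 0.5, 0.0, 0.0]
    print(l1_distance(p, uniform(4)))
    print(is_eps_far(p, uniform(4), eps=0.5))
```

With this convention the $\ell_1$-distance ranges over $[0, 2]$, so values of $\epsilon$ above 2 are vacuous.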
