Distributed Simulation and Distributed Inference

Independent samples from an unknown probability distribution $\mathbf{p}$ on a domain of size $k$ are distributed across $n$ players, with each player holding one sample. Each player can communicate $\ell$ bits to a central referee in a simultaneous message passing model of communication to help the referee infer a property of the unknown $\mathbf{p}$. What is the least number of players required for inference in the communication-starved setting of $\ell<\log k$? We begin by exploring a general "simulate-and-infer" strategy for such inference problems, where the center simulates the desired number of samples from the unknown distribution and applies standard inference algorithms for the collocated setting. Our first result shows that for $\ell<\log k$, perfect simulation of even a single sample is not possible. Nonetheless, we present a Las Vegas algorithm that simulates a single sample from the unknown distribution using $O(k/2^\ell)$ samples in expectation. As an immediate corollary, simulate-and-infer attains the optimal sample complexity of $\Theta(k^2/(2^\ell\epsilon^2))$ for learning the unknown distribution to total variation distance $\epsilon$. For the prototypical testing problem of identity testing, simulate-and-infer works with $O(k^{3/2}/(2^\ell\epsilon^2))$ samples, a requirement that seems to be inherent for all communication protocols not using any additional resources. Interestingly, we can break this barrier using public coins. Specifically, we exhibit a public-coin communication protocol that performs identity testing using $O(k/(2^{\ell/2}\epsilon^2))$ samples. Furthermore, we show that this is optimal up to constant factors. Our theoretically sample-optimal protocol is easy to implement in practice. Our proof of the lower bound entails showing a contraction in $\chi^2$ distance of product distributions due to communication constraints and may be of independent interest.
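The abstract does not spell out the simulation protocol, so the following is only a minimal sketch of one possible Las Vegas scheme for the special case $\ell = 1$, not necessarily the authors' algorithm. It assumes $2k$ players per round, two assigned to each symbol, each sending the single bit "my sample equals my assigned symbol". The function names `one_bit_round`, `acceptance_probabilities`, and `simulate_sample` are hypothetical. A round accepts symbol $i$ with probability $p_i \prod_j (1-p_j)$, which is proportional to $p_i$, so an accepted output is distributed exactly as $\mathbf{p}$. The expected number of rounds is $1/\prod_j (1-p_j)$, which is large when some $p_j$ is close to 1; this sketch does not address that case.

```python
import random
from math import prod


def one_bit_round(p, rng):
    """Run one round with 2k one-bit players; return a symbol or None (abort)."""
    k = len(p)
    symbols = range(k)
    # Two players are assigned to each symbol i. Each draws its own
    # independent sample from p and sends the single bit [sample == i].
    first = [rng.choices(symbols, weights=p)[0] == i for i in symbols]
    second = [rng.choices(symbols, weights=p)[0] == i for i in symbols]
    hits = [i for i in symbols if first[i]]
    # Accept i only if it is the unique hit in the first group and the
    # second-group player assigned to i reports a miss.
    if len(hits) == 1 and not second[hits[0]]:
        return hits[0]
    return None


def acceptance_probabilities(p):
    """Exact probability that a single round outputs each symbol."""
    miss_all = prod(1.0 - x for x in p)
    return [x * miss_all for x in p]


def simulate_sample(p, rng, max_rounds=10_000):
    """Las Vegas loop: repeat rounds until one accepts (never outputs a biased sample)."""
    for _ in range(max_rounds):
        out = one_bit_round(p, rng)
        if out is not None:
            return out
    raise RuntimeError("no round accepted within max_rounds")


if __name__ == "__main__":
    p = [0.5, 0.3, 0.2]  # illustrative distribution, chosen arbitrarily
    acc = acceptance_probabilities(p)
    print([a / sum(acc) for a in acc])  # conditional output law, compare with p
    print(simulate_sample(p, random.Random(0)))
```

Normalizing `acceptance_probabilities(p)` recovers `p`, which is the sense in which an accepted round yields a perfect sample even though no fixed number of players can guarantee one.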
