Monotone Probability Distributions over the Boolean Cube Can Be Learned with Sublinear Samples

A probability distribution over the Boolean cube is monotone if flipping the value of a coordinate of an element from zero to one can only increase the probability of that element. Given samples from an unknown monotone distribution over the Boolean cube, we give (to our knowledge) the first algorithm that learns an approximation of the distribution in statistical distance using a number of samples that is sublinear in the size of the domain. To do this, we develop a structural lemma describing monotone probability distributions. The structural lemma has further implications for the sample complexity of basic testing tasks for analyzing monotone probability distributions over the Boolean cube: we use it to give nontrivial upper bounds for estimating the distance of a monotone distribution from the uniform distribution and for estimating the support size of a monotone distribution. In the setting of monotone probability distributions over the Boolean cube, our algorithms are the first with sample complexity below the known lower bounds for the same testing tasks on arbitrary (not necessarily monotone) probability distributions. A further consequence of our learning algorithm is an improved sample complexity for testing whether a distribution on the Boolean cube is monotone. (An example sketch of the monotonicity condition itself follows below.)
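To make the definition concrete: a distribution p over {0,1}^n is monotone when p(x) <= p(y) whenever y is obtained from x by flipping some coordinate from 0 to 1 (equivalently, whenever x <= y coordinatewise, by transitivity). The sketch below is a minimal brute-force check of this condition for a distribution given as an explicit probability table; the helper name is_monotone and the dictionary representation are our own illustrative choices, not the paper's algorithm, and the check takes time exponential in n.

```python
from itertools import product

def is_monotone(p, n):
    """Brute-force check that a distribution over {0,1}^n is monotone.

    p maps each length-n 0/1 tuple to its probability mass. It suffices
    to check single-coordinate flips: if no 0 -> 1 flip ever decreases
    the probability, monotonicity over the whole coordinatewise order
    follows by transitivity.
    """
    for x in product((0, 1), repeat=n):
        for i in range(n):
            if x[i] == 0:
                y = x[:i] + (1,) + x[i + 1:]  # flip coordinate i from 0 to 1
                if p[y] < p[x]:
                    return False
    return True

# Example: the uniform distribution over {0,1}^2 is (trivially) monotone.
uniform = {x: 0.25 for x in product((0, 1), repeat=2)}
assert is_monotone(uniform, 2)
```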
