The power of online thinning in reducing discrepancy

Consider an infinite sequence of independent, uniformly chosen points from $$[0,1]^d$$. After looking at each point in the sequence, an overseer is allowed to either keep it or reject it, and this choice may depend on the locations of all previously kept points. However, the overseer must keep at least one of every two consecutive points. We call a sequence generated in this fashion a two-thinning sequence. Here, the purpose of the overseer is to control the discrepancy of the empirical distribution of points, that is, after selecting $$n$$ points, to reduce the maximal deviation of the number of points inside any axis-parallel hyper-rectangle of volume $$A$$ from $$nA$$. Our main result is an explicit low-complexity two-thinning strategy which guarantees discrepancy of $$O(\log^{2d+1} n)$$ for all $$n$$ with high probability [compare with $$\Theta(\sqrt{n\log\log n})$$ without thinning]. The case $$d=1$$ of this result answers a question of Benjamini. We also extend the construction to achieve the same asymptotic bound for $$(1+\beta)$$-thinning, a set-up in which rejecting is only allowed with probability $$\beta$$ independently for each point. In addition, we suggest an improved and simplified strategy which we conjecture to guarantee discrepancy of $$O(\log^{d+1} n)$$ [compare with $$\Theta(\log^d n)$$, the best known construction of a low discrepancy sequence]. Finally, we provide theoretical and empirical evidence for our conjecture, and provide simulations supporting the viability of our construction for applications.
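To make the two-thinning set-up concrete, the following is a minimal Python sketch for the case $$d=1$$. It is not the strategy constructed in the paper: the keep/reject rule here is a naive bin-balancing heuristic chosen purely for illustration, and the names `two_thinning`, `interval_discrepancy` and the parameter `bins` are assumptions introduced for this example. The sketch only shows the constraint that at least one of every two consecutive points is kept, and one way to measure the interval discrepancy of the kept points.

```python
import random


def interval_discrepancy(points):
    """Interval discrepancy in d = 1: the supremum over intervals I in [0, 1]
    of |#(points in I) - n * |I||, computed as max(F) - min(F), where
    F(x) = #(points <= x) - n * x, including left limits at the points."""
    n = len(points)
    xs = sorted(points)
    devs = [0.0]  # F(0) = 0
    for i, x in enumerate(xs):
        devs.append(i - n * x)      # left limit of F at x (i points counted)
        devs.append(i + 1 - n * x)  # value of F at x (i + 1 points counted)
    devs.append(0.0)  # F(1) = n - n = 0
    return max(devs) - min(devs)


def two_thinning(n_keep, bins=64, rng=None):
    """Illustrative two-thinning heuristic (an assumption, not the paper's
    strategy): reject a point if its bin already holds more than the average
    share of kept points, unless the previous point was rejected, in which
    case the point must be kept."""
    rng = rng or random.Random()
    kept = []
    counts = [0] * bins
    prev_rejected = False
    while len(kept) < n_keep:
        x = rng.random()
        b = min(int(x * bins), bins - 1)
        overfull = counts[b] * bins > len(kept)
        if overfull and not prev_rejected:
            prev_rejected = True  # use the single allowed rejection
            continue
        kept.append(x)
        counts[b] += 1
        prev_rejected = False
    return kept


if __name__ == "__main__":
    rng = random.Random(0)
    n = 2000
    plain = [rng.random() for _ in range(n)]
    thinned = two_thinning(n, rng=rng)
    print("no thinning :", interval_discrepancy(plain))
    print("two-thinning:", interval_discrepancy(thinned))
```

Running the script prints the discrepancy of an unthinned sample next to that of the thinned one for a single seed. A heuristic of this kind carries none of the guarantees stated above; it only serves as a baseline for experimenting with the definitions.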
