Regularity, Boosting, and Efficiently Simulating Every High-Entropy Distribution

We show that every bounded function g: {0,1}^n → [0,1] admits an efficiently computable "simulator" function h: {0,1}^n → [0,1] such that every circuit of fixed polynomial size has approximately the same correlation with g as with h. If g describes (up to scaling) a high min-entropy distribution D, then h can be used to efficiently sample a distribution D' of the same min-entropy that is indistinguishable from D by circuits of fixed polynomial size. We state and prove our result in a more abstract setting, in which we allow arbitrary finite domains instead of {0,1}^n and arbitrary families of distinguishers instead of fixed polynomial size circuits. Our result implies (a) the Weak Szemerédi Regularity Lemma of Frieze and Kannan, (b) a constructive version of the Dense Model Theorem of Green, Tao and Ziegler with better quantitative parameters (polynomial rather than exponential in the distinguishing probability), and (c) the Impagliazzo Hardcore Set Lemma. It appears to be the general result underlying the known connections between "regularity" results in graph theory, "decomposition" results in additive combinatorics, and the Hardcore Lemma in complexity theory. We present two proofs of our result, one in the spirit of Nisan's proof of the Hardcore Lemma via duality of linear programming, and one similar to Impagliazzo's "boosting" proof. A third proof by iterative partitioning, which yields a sampler whose complexity is exponential in the distinguishing probability, is also implicit in the Green-Tao-Ziegler proofs of the Dense Model Theorem.
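
To make the "boosting" idea concrete, the following is a minimal sketch of a boosting-style iteration on a toy instance. It is not the paper's construction or its parameters. It assumes a small explicit finite domain under the uniform distribution, an explicit finite list of distinguishers with values in [0,1], and a brute-force search for a violated distinguisher. The names `simulate`, `distinguishers`, and `eps` are hypothetical and chosen only for illustration.

```python
import itertools


def simulate(g, distinguishers, eps):
    """Boosting-style sketch: return (h, steps) such that
    |E[f(x) * (g(x) - h(x))]| <= eps for every listed distinguisher f,
    where the expectation is over the uniform distribution on the domain.

    g and each f are lists of values in [0, 1], indexed by domain point.
    """
    n = len(g)
    h = [0.5] * n  # start from the constant function 1/2
    steps = 0
    while True:
        # Search for a distinguisher that still separates g from h.
        violated = None
        for f in distinguishers:
            corr = sum(fx * (gx - hx) for fx, gx, hx in zip(f, g, h)) / n
            if abs(corr) > eps:
                violated = (f, 1.0 if corr > 0 else -1.0)
                break
        if violated is None:
            return h, steps
        f, sign = violated
        # Move h toward g along f, then clip back into [0, 1].
        # Each update lowers E[(g - h)^2] by at least eps^2, so the
        # loop stops after at most 1 / eps^2 updates.
        h = [min(1.0, max(0.0, hx + eps * sign * fx)) for hx, fx in zip(h, f)]
        steps += 1


# Toy instance: domain {0,1}^3, g is a noisy majority, and the
# distinguishers are the constant 1 and the three coordinate functions.
points = list(itertools.product([0, 1], repeat=3))
g = [0.9 if sum(x) >= 2 else 0.1 for x in points]
distinguishers = [[1.0] * len(points)] + [
    [float(x[i]) for x in points] for i in range(3)
]
h, steps = simulate(g, distinguishers, eps=0.05)
```

In this sketch the resulting h is a clipped sum of at most 1/eps^2 distinguishers, which illustrates why a simulator of this kind can have complexity polynomial in 1/eps relative to the distinguisher family.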
