Time-space lower bounds for two-pass learning

A line of recent works showed that for a large class of learning problems, any learning algorithm requires either super-linear memory size or a super-polynomial number of samples [11, 7, 12, 9, 2, 5]. For example, any algorithm for learning parities of size n requires either a memory of size Ω(n^2) or an exponential number of samples [11].

All these works modeled the learner as a one-pass branching program, allowing only a single pass over the stream of samples. In this work, we prove the first memory-samples lower bounds (with a super-linear lower bound on the memory size and a super-polynomial lower bound on the number of samples) when the learner is allowed two passes over the stream of samples. For example, we prove that any two-pass algorithm for learning parities of size n requires either a memory of size Ω(n^{1.5}) or at least 2^{Ω(√n)} samples.

More generally, a matrix M : A × X → {−1, 1} corresponds to the following learning problem: an unknown element x ∈ X is chosen uniformly at random, and a learner tries to learn x from a stream of samples (a_1, b_1), (a_2, b_2), ..., where for every i, a_i ∈ A is chosen uniformly at random and b_i = M(a_i, x). Assume that k, ℓ, r are such that any submatrix of M with at least 2^{−k} · |A| rows and at least 2^{−ℓ} · |X| columns has bias of at most 2^{−r}. We show that any two-pass learning algorithm for the learning problem corresponding to M requires either a memory of size at least Ω(k · min{k, √ℓ}), or at least 2^{Ω(min{k, √ℓ, r})} samples. For parities of size n one can take k, ℓ, r = Θ(n), which recovers the Ω(n^{1.5}) memory versus 2^{Ω(√n)} samples trade-off stated above.
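To make the streaming model concrete, the following is a minimal Python sketch (illustrative code, not from the paper) of the parity instance: A = X = {0,1}^n, M(a, x) = (−1)^{⟨a,x⟩ mod 2}, and the learner receives pairs (a_i, b_i) with a_i uniform. The learner shown is the standard Gaussian-elimination strategy, which keeps a row-reduced basis of the GF(2) equations it has seen; it uses Θ(n^2) bits of memory and O(n) samples, exactly the memory regime that the one-pass lower bound of [11] shows is unavoidable for sample-efficient learning. The bit-mask encoding, function names, and the toy parameter n are illustrative choices, not taken from the paper.

```python
import random

n = 12  # toy size; the lower bounds are asymptotic statements in n


def M(a, x):
    """Parity matrix entry: +1 if <a, x> is even, -1 if odd (a, x are n-bit masks)."""
    return -1 if bin(a & x).count("1") % 2 == 1 else 1


def sample_stream(x, rng):
    """Unbounded stream of samples (a_i, b_i) with a_i uniform in {0,1}^n and b_i = M(a_i, x)."""
    while True:
        a = rng.getrandbits(n)
        yield a, M(a, x)


def one_pass_gaussian_learner(stream):
    """One-pass learner storing a row-reduced basis of the GF(2) equations <a, x> = c.
    It keeps at most n rows of n+1 bits, i.e. Theta(n^2) bits of memory, and with
    high probability determines x after O(n) samples."""
    rows = []  # (pivot, row, rhs) triples in insertion order; pivots are distinct
    for a, b in stream:
        c = 0 if b == 1 else 1        # b = (-1)^{<a,x>}  =>  <a, x> = c (mod 2)
        for pivot, row, rhs in rows:  # reduce the incoming equation against the basis
            if (a >> pivot) & 1:
                a ^= row
                c ^= rhs
        if a == 0:
            continue                  # linearly dependent sample; nothing new learned
        rows.append((a.bit_length() - 1, a, c))
        if len(rows) == n:            # full rank: x is determined
            break
    x = 0                             # back-substitute in reverse insertion order
    for pivot, row, rhs in reversed(rows):
        if rhs ^ (bin(row & x).count("1") % 2):
            x |= 1 << pivot
    return x


rng = random.Random(0)
secret = rng.getrandbits(n)
recovered = one_pass_gaussian_learner(sample_stream(secret, rng))
print(recovered == secret)  # True: with Theta(n^2) memory, O(n) samples suffice
```

The point of the lower bounds above is that once the memory budget drops below the Θ(n^2) used by this sketch in the one-pass setting, or below Ω(n^{1.5}) in the two-pass setting, no strategy can succeed without an exponential (respectively 2^{Ω(√n)}) number of samples.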

[1] Miklos Santha et al. Generating Quasi-Random Sequences from Slightly-Random Sources (Extended Abstract), 1984, FOCS.

[2] Ran Raz et al. Extractor-based time-space lower bounds for learning, 2017, Electron. Colloquium Comput. Complex.

[3] Ran Raz et al. Interactive channel capacity, 2013, STOC '13.

[4] Ran Raz et al. Time-space hardness of learning sparse parities, 2017, Electron. Colloquium Comput. Complex.

[5] Dana Moshkovitz et al. Entropy Samplers and Strong Generic Lower Bounds For Space Bounded Learning, 2018, ITCS.

[6] Ohad Shamir et al. Fundamental Limits of Online and Distributed Algorithms for Statistical Learning and Estimation, 2013, NIPS.

[7] Oded Goldreich et al. Unbiased Bits from Sources of Weak Randomness and Probabilistic Communication Complexity, 1988, SIAM J. Comput.

[8] Ohad Shamir et al. Detecting Correlations with Little Memory and Communication, 2018.

[9] Gregory Valiant et al. Information Theoretically Secure Databases, 2016, Electron. Colloquium Comput. Complex.

[10] Ran Raz. Fast Learning Requires Good Memory: A Time-Space Lower Bound for Parity Learning, 2018.

[11] David A. Mix Barrington et al. Bounded-width polynomial-size branching programs recognize exactly those languages in NC1, 1986, STOC '86.

[12] Ran Raz et al. A Time-Space Lower Bound for a Large Class of Learning Problems, 2017, 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS).

[13] Gregory Valiant et al. Memory, Communication, and Statistical Queries, 2016, COLT.

[14] Dana Moshkovitz et al. Mixing Implies Lower Bounds for Space Bounded Learning, 2017, COLT.

[15] Naftali Tishby et al. Mixing Complexity and its Applications to Neural Networks, 2017, ArXiv.

[16] Xin Yang et al. Time-Space Tradeoffs for Learning Finite Functions from Random Evaluations, with Applications to Polynomials, 2018, Electron. Colloquium Comput. Complex.