Learning Polynomials with Queries: The Highly Noisy Case

Given a function $f$ mapping $n$-variate inputs from a finite field $F$ into $F$, we consider the task of reconstructing a list of all $n$-variate degree-$d$ polynomials that agree with $f$ on a tiny but nonnegligible fraction, $\delta$, of the input space. We give a randomized algorithm for solving this task. The algorithm accesses $f$ as a black box and runs in time polynomial in $\frac{n}{\delta}$ and exponential in $d$, provided $\delta$ is $\Omega(\sqrt{d/|F|})$. For the special case when $d = 1$, we solve this problem for all $\epsilon \stackrel{\mathrm{def}}{=} \delta - \frac{1}{|F|} > 0$. In this case the running time of our algorithm is bounded by a polynomial in $\frac{1}{\epsilon}$ and $n$. Our algorithm generalizes a previously known algorithm, due to Goldreich and Levin [in Proceedings of the 21st Annual ACM Symposium on Theory of Computing, Seattle, WA, ACM Press, New York, 1989, pp. 25--32], that solves this task for the case when $F = \mathrm{GF}(2)$ (and $d = 1$). In the process we provide new bounds on the number of degree-$d$ polynomials that may agree with any given function on a $\delta \geq \sqrt{d/|F|}$ fraction of the inputs. This result is derived by generalizing a well-known bound from coding theory on the number of codewords from an error-correcting code that can be "close" to an arbitrary word; our generalization works for codes over arbitrary alphabets, whereas the previous result held only for binary alphabets.
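
To make the reconstruction task concrete, the sketch below brute-forces the $d = 1$ case over a toy prime field: it enumerates every affine polynomial in $n$ variables over GF(p) and keeps those agreeing with the black-box $f$ on at least a $\delta$ fraction of the inputs. This is only an illustration of the problem statement, not the paper's algorithm (which runs in time polynomial in $n/\delta$ and never enumerates all candidates); the function names and toy parameters are assumptions made for this example.

    # Brute-force illustration of the d = 1 reconstruction task over GF(p).
    # NOT the paper's algorithm: it enumerates all p^(n+1) affine candidates
    # and all p^n inputs, so it is feasible only for toy parameters.
    from itertools import product

    def agreement(f, g, points):
        # Fraction of the given points on which the two functions agree.
        return sum(1 for x in points if f(x) == g(x)) / len(points)

    def list_decode_degree1(f, n, p, delta):
        # Return the coefficient tuples (a0, a1, ..., an) of every affine
        # polynomial a0 + a1*x1 + ... + an*xn over GF(p) that agrees with
        # the black box f on at least a delta fraction of GF(p)^n.
        points = list(product(range(p), repeat=n))
        candidates = []
        for coeffs in product(range(p), repeat=n + 1):
            a0, rest = coeffs[0], coeffs[1:]
            g = lambda x, a0=a0, rest=rest: (a0 + sum(a * xi for a, xi in zip(rest, x))) % p
            if agreement(f, g, points) >= delta:
                candidates.append(coeffs)
        return candidates

    if __name__ == "__main__":
        p, n, delta = 5, 2, 0.4
        corrupted = {(0, 0), (1, 2), (3, 4)}  # inputs on which f is noisy
        def f(x):
            # A corrupted version of the affine polynomial 1 + 2*x1 + 3*x2 over GF(5).
            val = (1 + 2 * x[0] + 3 * x[1]) % p
            return (val + 1) % p if x in corrupted else val
        print(list_decode_degree1(f, n, p, delta))  # expect [(1, 2, 3)]

The toy threshold $\delta = 0.4$ is chosen above $1/|F| = 0.2$, the largest agreement two distinct affine polynomials can share, mirroring the requirement $\epsilon = \delta - \frac{1}{|F|} > 0$ from the abstract.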

[1] Leonid A. Levin et al., A hard-core predicate for all one-way functions, 1989, STOC '89.

[2] Madhu Sudan et al., Improved Low-Degree Testing and its Applications, 1997, STOC '97.

[3] Ronitt Rubinfeld, On the Robustness of Functional Equations, 1999, SIAM J. Comput.

[4] Pascal Koiran et al., Efficient learning of continuous neural networks, 1994, COLT '94.

[5] Ronitt Rubinfeld et al., Learning fallible finite state automata, 1993, COLT '93.

[6] Richard J. Lipton et al., Cryptographic Primitives Based on Hard Learning Problems, 1993, CRYPTO.

[7] Richard J. Lipton et al., A Probabilistic Remark on Algebraic Program Testing, 1978, Inf. Process. Lett.

[8] Madhu Sudan et al., Decoding of Reed Solomon Codes beyond the Error-Correction Bound, 1997, J. Complex.

[9] Ronitt Rubinfeld et al., Robust Characterizations of Polynomials with Applications to Program Testing, 1996, SIAM J. Comput.

[10] Richard Zippel et al., Effective polynomial computation, 1993, The Kluwer international series in engineering and computer science.

[11] Richard J. Lipton et al., New Directions In Testing, 1989, Distributed Computing And Cryptography.

[12] Philip D. Laird et al., Learning from good data and bad, 1987.

[13] Madhu Sudan et al., Highly Resilient Correctors for Polynomials, 1992, Inf. Process. Lett.

[14] Rani Siromoney et al., A noise model on learning sets of strings, 1992, COLT '92.

[15] Manuel Blum et al., Self-testing/correcting with applications to numerical problems, 1990, STOC '90.

[16] Yasubumi Sakakibara et al., On Learning from Queries and Counterexamples in the Presence of Noise, 1991, Inf. Process. Lett.

[17] Richard Zippel et al., Interpolating Polynomials from Their Values, 1990, J. Symb. Comput.

[18] Avrim Blum et al., Learning switching concepts, 1992, COLT '92.

[19] Venkatesan Guruswami et al., Improved decoding of Reed-Solomon and algebraic-geometry codes, 1999, IEEE Trans. Inf. Theory.

[20] Eyal Kushilevitz et al., Learning Decision Trees Using the Fourier Spectrum, 1993, SIAM J. Comput.

[21] Vijay V. Vazirani et al., Efficient and Secure Pseudo-Random Number Generation (Extended Abstract), 1984, FOCS.

[22] Ming Li et al., Learning in the Presence of Malicious Errors, 1993, SIAM J. Comput.

[23] Yishay Mansour et al., Randomized Interpolation and Approximation of Sparse Polynomials, 1992, SIAM J. Comput.

[24] Joan Feigenbaum et al., Hiding Instances in Multioracle Queries, 1990, STACS.

[25] Robert H. Sloan et al., Corrigendum to types of noise in data for concept learning, 1988, COLT '92.

[26] Richard Zippel et al., Probabilistic algorithms for sparse polynomials, 1979, EUROSAM.

[27] Robert E. Schapire et al., Exact Identification of Read-Once Formulas Using Fixed Points of Amplification Functions, 1993, SIAM J. Comput.

[28] Hal Wasserman et al., Reconstructing randomly sampled multivariate polynomials from highly noisy data, 1998, SODA '98.

[29] M. Sudan et al., Robust Characterizations of Polynomials and Their Applications to Program Testing, 1993.

[30] Luca Trevisan et al., Pseudorandom generators without the XOR Lemma, 1999, Electron. Colloquium Comput. Complex.

[31] F. MacWilliams et al., The Theory of Error-Correcting Codes, 1977.

[32] Wolfgang Maass et al., Agnostic PAC Learning of Functions on Analog Neural Nets, 1993, Neural Computation.

[33] Dana Ron et al., Learning to model sequences generated by switching distributions, 1995, COLT '95.

[34] Ronitt Rubinfeld et al., Reconstructing Algebraic Functions from Mixed Data, 1998, SIAM J. Comput.

[35] Ronitt Rubinfeld et al., Learning fallible Deterministic Finite Automata, 1995, Machine Learning.

[36] Ronitt Rubinfeld et al., Learning polynomials with queries: The highly noisy case, 1995, Proceedings of IEEE 36th Annual Foundations of Computer Science.

[37] Jacob T. Schwartz et al., Fast Probabilistic Algorithms for Verification of Polynomial Identities, 1980, J. ACM.

[38] Wolfgang Maass et al., Efficient agnostic PAC-learning with simple hypothesis, 1994, COLT '94.