Learning with Queries Corrupted by Classification Noise

Abstract

Kearns introduced the “statistical query” (SQ) model as a general method for producing learning algorithms that are robust against classification noise. We extend this approach in several ways to handle algorithms that use “membership queries”, focusing on the more stringent model of “persistent noise”. The main ingredients in the general analysis are (1) smallness of dimension of the classes of both the target and the queries, and (2) independence of the noise variables. Persistence restricts independence: repeated queries at the same point x must return the same label. We apply the general analysis to obtain a noise-robust version of Jackson's Harmonic Sieve, which learns DNF under the uniform distribution. This corrects an error in his earlier analysis of noise-tolerant DNF learning.
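The persistence condition can be illustrated with a small simulation. The following is only a minimal sketch under stated assumptions, not the construction used in the paper: the class name `PersistentNoisyOracle`, the target `example_dnf`, and the `noise_rate` and `seed` parameters are all hypothetical names introduced for illustration. The noise decision for each point is drawn once and cached, so asking about the same point x again returns the same (possibly wrong) label.

```python
import random


class PersistentNoisyOracle:
    """Membership oracle with persistent classification noise (illustrative sketch)."""

    def __init__(self, target, noise_rate, seed=0):
        self.target = target          # hypothetical target concept: tuple of bits -> 0/1
        self.noise_rate = noise_rate  # probability that a point's label is flipped
        self.rng = random.Random(seed)
        self.flipped = {}             # cached noise decision for each queried point

    def query(self, x):
        # Draw the noise bit only on the first query of x; reuse it afterwards.
        if x not in self.flipped:
            self.flipped[x] = self.rng.random() < self.noise_rate
        label = self.target(x)
        return 1 - label if self.flipped[x] else label


def example_dnf(x):
    # Hypothetical target: (x0 AND x1) OR (NOT x2)
    return int((x[0] and x[1]) or (not x[2]))


if __name__ == "__main__":
    oracle = PersistentNoisyOracle(example_dnf, noise_rate=0.3, seed=1)
    point = (1, 0, 1)
    # Repeating the query cannot average the noise away: every answer is identical.
    print([oracle.query(point) for _ in range(5)])
```

Because repeated queries give no new information, a learner in this model cannot remove noise by majority vote at a single point. It has to estimate quantities over many distinct points, where the noise variables are still independent.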
