Paper information - Agnostically learning halfspaces

Agnostically learning halfspaces

We give the first algorithm that (under distributional assumptions) efficiently learns halfspaces in the notoriously difficult agnostic framework of Kearns, Schapire, & Sellie, where a learner is given access to labeled examples drawn from a distribution, without restriction on the labels (e.g. adversarial noise). The algorithm constructs a hypothesis whose error rate on future examples is within an additive $\epsilon$ of the optimal halfspace, in time poly(n) for any constant $\epsilon > 0$, under the uniform distribution over $\{-1, 1\}^n$ or the unit sphere in $\mathbb{R}^n$, as well as under any log-concave distribution over $\mathbb{R}^n$. It also agnostically learns Boolean disjunctions in time $2^{\tilde{O}(\sqrt{n})}$ with respect to any distribution. The new algorithm, essentially $L_1$ polynomial regression, is a noise-tolerant, arbitrary-distribution generalization of the "low degree" Fourier algorithm of Linial, Mansour, & Nisan. We also give a new algorithm for PAC learning halfspaces under the uniform distribution on the unit sphere with the current best bounds on tolerable rate of "malicious noise".
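The abstract names the core technique only briefly, so the following is a minimal sketch of what L1 polynomial regression could look like on the Boolean cube. It is an illustration under stated assumptions, not the paper's implementation: the function names (`monomials`, `l1_poly_regression`, `best_threshold`) are hypothetical, the L1 objective is minimized approximately by plain subgradient descent rather than by an exact linear program, and the threshold is chosen to minimize error on the training sample.

```python
import itertools
import random


def monomials(x, degree):
    """Values of all multilinear monomials of degree <= `degree` at x in {-1, 1}^n."""
    feats = []
    for d in range(degree + 1):
        for idx in itertools.combinations(range(len(x)), d):
            prod = 1.0
            for i in idx:
                prod *= x[i]
            feats.append(prod)
    return feats


def l1_poly_regression(X, y, degree, steps=200, lr=0.1):
    """Approximately minimize mean |p(x_i) - y_i| over polynomials p of the given degree.

    Assumption: subgradient descent with step size lr / sqrt(t) stands in for
    an exact L1 solver. Returns the coefficients of the lowest-loss iterate.
    """
    Phi = [monomials(x, degree) for x in X]
    m, k = len(Phi), len(Phi[0])
    w = [0.0] * k
    best_w, best_loss = list(w), float("inf")
    for t in range(1, steps + 1):
        grad = [0.0] * k
        loss = 0.0
        for phi, yi in zip(Phi, y):
            r = sum(wj * pj for wj, pj in zip(w, phi)) - yi
            loss += abs(r)
            s = (r > 0) - (r < 0)  # sign of the residual: a subgradient of |r|
            for j in range(k):
                grad[j] += s * phi[j]
        if loss / m < best_loss:
            best_loss, best_w = loss / m, list(w)
        step = lr / t ** 0.5
        w = [wj - step * gj / m for wj, gj in zip(w, grad)]
    return best_w


def best_threshold(scores, y):
    """Pick t minimizing the empirical error of the rule: +1 if score > t, else -1."""
    distinct = sorted(set(scores))
    candidates = [distinct[0] - 1.0] + distinct  # first candidate labels everything +1
    best_t, best_err = candidates[0], float("inf")
    for t in candidates:
        wrong = sum((1 if s > t else -1) != yi for s, yi in zip(scores, y))
        if wrong / len(y) < best_err:
            best_t, best_err = t, wrong / len(y)
    return best_t, best_err


if __name__ == "__main__":
    rng = random.Random(0)
    n, m, degree = 5, 300, 2
    X = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(m)]
    # Labels from the halfspace sign(x1 + x2 + x3), with about 10% of them flipped.
    y = [1 if x[0] + x[1] + x[2] > 0 else -1 for x in X]
    y = [-yi if rng.random() < 0.1 else yi for yi in y]
    w = l1_poly_regression(X, y, degree)
    scores = [sum(wj * pj for wj, pj in zip(w, monomials(x, degree))) for x in X]
    t, err = best_threshold(scores, y)
    print("threshold:", t, "training error:", err)
```

The hypothesis is the thresholded polynomial, sign(p(x) - t). Multilinear monomials suffice here because every coordinate squares to 1 on the cube; for the sphere or log-concave settings a different polynomial basis would be needed, which this sketch does not cover.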
