Learning intersections and thresholds of halfspaces

We give the first polynomial-time algorithm to learn any function of a constant number of halfspaces under the uniform distribution to within any constant error parameter. We also give the first quasipolynomial-time algorithm for learning any function of a polylogarithmic number of polynomial-weight halfspaces under any distribution. As special cases of these results, we obtain algorithms for learning intersections and thresholds of halfspaces. Our uniform-distribution learning algorithms take a novel non-geometric approach to learning halfspaces: we use Fourier techniques together with a careful analysis of the noise sensitivity of functions of halfspaces. Our algorithms for learning under any distribution use techniques from real approximation theory to construct low-degree polynomial threshold functions.
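The noise sensitivity mentioned in the abstract can be made concrete with a small experiment. The sketch below is illustrative only and is not the paper's algorithm: it estimates, by Monte Carlo sampling, the standard quantity Pr[f(x) != f(y)], where x is uniform on {-1, 1}^n and y is obtained from x by flipping each coordinate independently with probability eps. All function names (`halfspace`, `intersection`, `noise_sensitivity`) and the example parameters are assumptions introduced here for illustration.

```python
import random


def halfspace(weights, theta):
    """Return h with h(x) = +1 if <weights, x> >= theta, else -1."""
    def h(x):
        return 1 if sum(w * xi for w, xi in zip(weights, x)) >= theta else -1
    return h


def intersection(halfspaces):
    """Return the AND of the given halfspaces: +1 only if all output +1."""
    def f(x):
        return 1 if all(h(x) == 1 for h in halfspaces) else -1
    return f


def noise_sensitivity(f, n, eps, samples=20000, seed=0):
    """Monte Carlo estimate of Pr[f(x) != f(y)].

    x is drawn uniformly from {-1, 1}^n; y flips each coordinate of x
    independently with probability eps. The seed is fixed only so that
    repeated runs give the same estimate.
    """
    rng = random.Random(seed)
    disagreements = 0
    for _ in range(samples):
        x = [rng.choice((-1, 1)) for _ in range(n)]
        y = [-xi if rng.random() < eps else xi for xi in x]
        if f(x) != f(y):
            disagreements += 1
    return disagreements / samples


if __name__ == "__main__":
    n = 15
    # Hypothetical example: two halfspaces with small integer weights.
    h1 = halfspace([1] * n, 0)
    h2 = halfspace([1 if i % 2 == 0 else -1 for i in range(n)], 0)
    f = intersection([h1, h2])
    for eps in (0.01, 0.05, 0.1):
        print(eps, noise_sensitivity(f, n, eps, samples=5000))
```

Running the script prints one estimate per noise rate. The connection to learning, as the abstract indicates, is that the noise sensitivity of functions of halfspaces is analyzed alongside Fourier techniques; this sketch only measures the quantity empirically and makes no claim about the bounds proved in the paper.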
