PAC Learning Intersections of Halfspaces with Membership Queries

Abstract. A randomized learning algorithm, POLLY, is presented that efficiently learns intersections of s halfspaces in n dimensions, in time polynomial in both s and n. The learning protocol is the PAC (probably approximately correct) model of Valiant, augmented with membership queries. In particular, POLLY receives a set S of m = poly(n, s, 1/ε, 1/δ) points drawn at random from an arbitrary distribution over the unit hypercube, and is told exactly which of these points lie inside, and which lie outside, the convex polyhedron P defined by the halfspaces. POLLY may also obtain the same information about points of its own choosing. It is shown that after poly(n, s, 1/ε, 1/δ, log(1/d)) time, the probability that POLLY fails to output a collection of s halfspaces with classification error at most ε is at most δ. Here, d is the minimum distance between the boundary of the target and those examples in S that do not lie on the boundary. The parameter log(1/d) can be bounded by the number of bits needed to encode the coefficients of the bounding hyperplanes and the coordinates of the sampled examples in S. Moreover, POLLY can be extended to learn unions of k disjoint polyhedra, each with at most s facets, in time poly(n, k, s, 1/ε, 1/δ, log(1/d), 1/γ), where γ is the minimum distance between any two distinct polyhedra.
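The protocol described in the abstract can be illustrated with a small Python sketch. This is not POLLY itself, whose steps the abstract does not give; it only models the two kinds of information the learner has (labels on a random sample and answers to membership queries) and shows one generic way a query learner can use them. The names `make_membership_oracle`, `boundary_search`, and `empirical_error` are hypothetical, as is the convention that a halfspace is a pair (w, b) meaning w·x ≤ b. The bisection step is an assumption chosen for illustration: it locates a boundary point to precision `tol` with about log2(1/tol) queries, which gives some intuition for why a logarithmic precision term such as log(1/d) can appear in a running-time bound.

```python
import random


def make_membership_oracle(halfspaces):
    """Return a membership oracle for P = {x : w.x <= b for every (w, b)}.

    The (w, b) representation is an assumption made for this sketch.
    """
    def member(x):
        return all(
            sum(wi * xi for wi, xi in zip(w, x)) <= b for w, b in halfspaces
        )
    return member


def boundary_search(member, inside, outside, tol):
    """Bisect the segment from a point inside P to a point outside P.

    Returns (point, queries): a point of P within tol (in the segment
    parameter t) of the boundary, and the number of membership queries used.
    Because P is convex, the segment crosses the boundary exactly once.
    """
    lo, hi = 0.0, 1.0  # t=0 is the inside point, t=1 the outside point
    queries = 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        x = [a + mid * (c - a) for a, c in zip(inside, outside)]
        queries += 1
        if member(x):
            lo = mid
        else:
            hi = mid
    point = [a + lo * (c - a) for a, c in zip(inside, outside)]
    return point, queries


def empirical_error(member, hypothesis, sample):
    """Fraction of sample points on which the hypothesis disagrees with P."""
    return sum(member(x) != hypothesis(x) for x in sample) / len(sample)


# Hypothetical target: two halfspaces in the unit square.
target = make_membership_oracle([([1.0, 0.0], 0.5), ([0.0, 1.0], 0.75)])
rng = random.Random(0)
sample = [[rng.random(), rng.random()] for _ in range(2000)]

# A hypothesis that keeps only the first halfspace misclassifies the
# points with x0 <= 0.5 and x1 > 0.75.
one_halfspace = make_membership_oracle([([1.0, 0.0], 0.5)])
print(empirical_error(target, one_halfspace, sample))
print(boundary_search(target, [0.25, 0.25], [1.0, 0.25], 2 ** -10))
```

In this sketch the number of queries grows by one each time `tol` is halved, so the cost of locating a boundary point depends on the required precision only logarithmically.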
