Paper information - Learning Geometric Concepts via Gaussian Surface Area

Learning Geometric Concepts via Gaussian Surface Area

We study the learnability of sets in $\mathbb{R}^n$ under the Gaussian distribution, taking Gaussian surface area as the "complexity measure" of the sets being learned. Let $\mathcal{C}_S$ denote the class of all (measurable) sets with surface area at most $S$. We first show that the class $\mathcal{C}_S$ is learnable to any constant accuracy in time $n^{O(S^2)}$, even in the arbitrary noise ("agnostic") model. Complementing this, we also show that any learning algorithm for $\mathcal{C}_S$ information-theoretically requires $2^{\Omega(S^2)}$ examples for learning to constant accuracy. These results together show that Gaussian surface area essentially characterizes the computational complexity of learning under the Gaussian distribution. Our approach yields several new learning results, including the following (all bounds are for learning to any constant accuracy). The class of all convex sets can be agnostically learned in time $2^{\tilde{O}(\sqrt{n})}$ (and we prove a $2^{\Omega(\sqrt{n})}$ lower bound for noise-free learning). This is the first subexponential time algorithm for learning general convex sets even in the noise-free (PAC) model. Intersections of $k$ halfspaces can be agnostically learned in time $n^{O(\log k)}$ (cf. Vempala's $n^{O(k)}$ time algorithm for learning in the noise-free model). Cones (with apex centered at the origin), and spheres with arbitrary radius and center, can be agnostically learned in time $\mathrm{poly}(n)$.
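The abstract does not spell out what Gaussian surface area is. As an illustration only, the sketch below assumes the usual definition: the Gaussian mass of a thin shell of width $\delta$ just outside the set, divided by $\delta$, as $\delta \to 0$. Under that assumption a halfspace $\{x : \langle w, x\rangle \le \theta\}$ with unit normal $w$ has surface area equal to the standard normal density at $\theta$, which is at most $1/\sqrt{2\pi} \approx 0.4$. The function names, the shell width, and the sample count are hypothetical choices made for this example and are not taken from the paper.

```python
import math
import random


def halfspace_gsa_exact(theta):
    """Gaussian surface area of {x : <w, x> <= theta} for a unit vector w.

    Under the assumed definition this equals the standard normal density
    at theta, because <w, x> is distributed as N(0, 1).
    """
    return math.exp(-theta * theta / 2.0) / math.sqrt(2.0 * math.pi)


def halfspace_gsa_monte_carlo(w, theta, delta=0.1, samples=100_000, seed=0):
    """Estimate the surface area as (Gaussian mass of a delta-shell) / delta.

    The estimate carries a small bias from the finite shell width delta
    and sampling noise from the finite number of samples.
    """
    rng = random.Random(seed)
    norm = math.sqrt(sum(c * c for c in w))
    unit = [c / norm for c in w]  # normalise so the shell has width delta
    hits = 0
    for _ in range(samples):
        # Project a fresh standard Gaussian point onto the unit normal.
        proj = sum(c * rng.gauss(0.0, 1.0) for c in unit)
        if theta < proj <= theta + delta:
            hits += 1
    return hits / samples / delta


if __name__ == "__main__":
    w = [1.0, 2.0, -1.0, 0.5]  # arbitrary direction in R^4
    print("closed form :", halfspace_gsa_exact(0.0))
    print("Monte Carlo :", halfspace_gsa_monte_carlo(w, 0.0))
```

Because a single halfspace has surface area bounded by a constant, plugging $S = O(1)$ into the stated $n^{O(S^2)}$ bound gives a polynomial running time for constant accuracy.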

Ryan O'Donnell | Rocco A. Servedio | Adam R. Klivans

[1] Philip M. Long. On the sample complexity of PAC learning half-spaces against the uniform distribution, 1995, IEEE Trans. Neural Networks.

[2] Santosh S. Vempala, et al. The Random Projection Method, 2005, DIMACS Series in Discrete Mathematics and Theoretical Computer Science.

[3] H. Balsters, et al. Learnability with respect to fixed distributions, 1991.

[4] Sergey G. Bobkov, et al. On Gaussian and Bernoulli covariance representations, 2001.

[5] Carl E. Rasmussen, et al. Gaussian processes for machine learning, 2005, Adaptive computation and machine learning.

[6] Gilles Zémor, et al. Discrete Isoperimetric Inequalities and the Probability of a Decoding Error, 2000, Combinatorics, Probability and Computing.

[7] Eric B. Baum, et al. The Perceptron Algorithm is Fast for Nonmalicious Distributions, 1990, Neural Computation.

[8] Neil D. Lawrence, et al. Semi-supervised Learning via Gaussian Processes, 2004, NIPS.

[9] Keith Ball. The reverse isoperimetric problem for Gaussian measure, 1993, Discret. Comput. Geom.

[10] William Feller, et al. An Introduction to Probability Theory and Its Applications, 1967.

[11] F. Nazarov. On the Maximal Perimeter of a Convex Set in $\mathbb{R}^n$ with Respect to a Gaussian Measure, 2003.

[12] P. Patnaik. The Non-central χ²- and F-distributions and Their Applications, 1949.

[13] Y. Peres. Noise Stability of Weighted Majority, 2004, math/0412377.

[14] M. Talagrand. Isoperimetry, logarithmic Sobolev inequalities on the discrete cube, and Margulis' graph connectivity theorem, 1993.

[15] V. Bentkus. On the dependence of the Berry–Esseen bound on dimension, 2003.

[16] D. Bakry. L'hypercontractivité et son utilisation en théorie des semigroupes, 1994.

[17] Radford M. Neal. Pattern Recognition and Machine Learning, 2007, Technometrics.

[18] S. Bobkov. An isoperimetric inequality on the discrete cube, and an elementary proof of the isoperimetric inequality in Gauss space, 1997.

[19] Philip M. Long. Halfspace Learning, Linear Programming, and Nonmalicious Distributions, 1994, Inf. Process. Lett.

[20] Shai Ben-David, et al. On the difficulty of approximately maximizing agreements, 2000, J. Comput. Syst. Sci.

[21] Alon Itai, et al. Learnability by fixed distributions, 1988, COLT '88.

[22] Alexander A. Sherstov, et al. Cryptographic Hardness for Learning Intersections of Halfspaces, 2006, 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06).

[23] Santosh S. Vempala, et al. A random sampling based algorithm for learning the intersection of half-spaces, 1997, Proceedings 38th Annual Symposium on Foundations of Computer Science.

[24] I. Benjamini, et al. Noise sensitivity of Boolean functions and applications to percolation, 1998, math/9811157.

[25] V. Sudakov, et al. Extremal properties of half-spaces for spherically invariant measures, 1978.

[26] Leslie G. Valiant, et al. A theory of the learnable, 1984, CACM.

[27] Nader H. Bshouty, et al. Maximizing Agreements with One-Sided Error with Applications to Heuristic Learning, 2005, Machine Learning.

[28] C. Borell. The Brunn-Minkowski inequality in Gauss space, 1975.

[29] David Haussler, et al. Learnability and the Vapnik-Chervonenkis dimension, 1989, JACM.

[30] M. Talagrand, et al. Probability in Banach spaces, 1991.

[31] S. Bobkov, et al. Discrete isoperimetric and Poincaré-type inequalities, 1999.

[32] Umesh V. Vazirani, et al. An Introduction to Computational Learning Theory, 1994.

[33] Leslie G. Valiant, et al. A general lower bound on the number of examples needed for learning, 1988, COLT '88.

[34] Avrim Blum, et al. Learning an Intersection of a Constant Number of Halfspaces over a Uniform Distribution, 1997, J. Comput. Syst. Sci.

[35] J. Lindenstrauss, et al. Geometric Aspects of Functional Analysis, 1987.

[36] Eric B. Baum, et al. On learning a union of half spaces, 1990, J. Complex.

[37] Rocco A. Servedio, et al. Agnostically learning halfspaces, 2005, 46th Annual IEEE Symposium on Foundations of Computer Science (FOCS'05).

[38] S. Janson. Gaussian Hilbert Spaces, 1997.

[39] Rocco A. Servedio, et al. Learning intersections and thresholds of halfspaces, 2002, The 43rd Annual IEEE Symposium on Foundations of Computer Science, 2002. Proceedings.

[40] P. Patnaik. The Non-central χ²- and F-distributions and Their Applications, 1949.

[41] Nader H. Bshouty, et al. On the Fourier spectrum of monotone functions, 1996, JACM.

[42] M. Ledoux. Semigroup proofs of the isoperimetric inequality in Euclidean and Gauss space, 1994.

[43] V. Rich. Personal communication, 1989, Nature.

[44] Colin McDiarmid, et al. Surveys in Combinatorics, 1989: On the method of bounded differences, 1989.

[45] Rocco A. Servedio, et al. Learning intersections of halfspaces with a margin, 2004, J. Comput. Syst. Sci.

[46] L. Gross. Logarithmic Sobolev inequalities, 1975.

[47] Zoubin Ghahramani, et al. Combining active learning and semi-supervised learning using Gaussian fields and harmonic functions, 2003, ICML 2003.

[48] Michel Talagrand, et al. How much are increasing sets positively correlated?, 1996, Comb.

[49] Christian Houdré, et al. Some Connections Between Isoperimetric and Sobolev-Type Inequalities, 1997.

[50] William Feller, et al. An Introduction to Probability Theory and Its Applications, 1951.

[51] G. Pisier. Probabilistic methods in the geometry of Banach spaces, 1986.

[52] Stephen Kwek, et al. PAC Learning Intersections of Halfspaces with Membership Queries, 1998, Algorithmica.