Paper information - Hardness of Learning Halfspaces with Noise

Hardness of Learning Halfspaces with Noise

Learning an unknown halfspace (also called a perceptron) from labeled examples is one of the classic problems in machine learning. In the noise-free case, when a halfspace consistent with all the training examples exists, the problem can be solved in polynomial time using linear programming. However, under the promise that a halfspace consistent with a fraction (1-\varepsilon) of the examples exists (for some small constant \varepsilon > 0), it was not known how to efficiently find a halfspace that is correct on even 51% of the examples. Nor was any hardness result known that ruled out getting agreement on more than 99.9% of the examples. In this work, we close this gap in our understanding, and prove that even a tiny amount of worst-case noise makes the problem of learning halfspaces intractable in a strong sense. Specifically, for arbitrary \varepsilon, \delta > 0, we prove that given a set of example-label pairs from the hypercube, a fraction (1-\varepsilon) of which can be explained by a halfspace, it is NP-hard to find a halfspace that correctly labels a fraction (1/2 + \delta) of the examples. The hardness result is tight since it is trivial to get agreement on 1/2 of the examples. In learning theory parlance, we prove that weak proper agnostic learning of halfspaces is hard. This settles a question that was raised by Blum et al. in their work on learning halfspaces in the presence of random classification noise [7], and in some more recent works as well. Along the way, we also obtain a strong hardness result for another basic computational problem: solving a linear system over the rationals.
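The tightness remark in the abstract (agreement on 1/2 of the examples is trivial) can be illustrated with a minimal sketch. It is not taken from the paper: the function names are hypothetical, and it assumes hypercube points are encoded in {-1,+1}^n with labels in {-1,+1}. Under that encoding, one of the two constant halfspaces (label everything +1, or everything -1) always agrees with at least half of the examples.

```python
from typing import List, Sequence, Tuple

# Assumed encoding (not from the paper): points in {-1,+1}^n, labels in {-1,+1}.
Example = Tuple[Sequence[int], int]
Halfspace = Tuple[List[float], float]  # (weights w, threshold theta)


def halfspace_label(w: Sequence[float], theta: float, x: Sequence[int]) -> int:
    """Return +1 if <w, x> >= theta, else -1."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= theta else -1


def agreement(w: Sequence[float], theta: float, examples: Sequence[Example]) -> float:
    """Fraction of examples whose label matches the halfspace's label."""
    correct = sum(1 for x, y in examples if halfspace_label(w, theta, x) == y)
    return correct / len(examples)


def trivial_halfspace(examples: Sequence[Example]) -> Halfspace:
    """Pick the better of two constant halfspaces on the hypercube.

    With w = e_1, the inner product <w, x> is -1 or +1, so the threshold -2
    labels every point +1 and the threshold +2 labels every point -1.
    The more common label covers at least half of the examples.
    """
    n = len(examples[0][0])
    w = [1.0] + [0.0] * (n - 1)
    all_positive: Halfspace = (w, -2.0)
    all_negative: Halfspace = (w, 2.0)
    return max(
        (all_positive, all_negative),
        key=lambda h: agreement(h[0], h[1], examples),
    )


if __name__ == "__main__":
    # Parity-style labels on {-1,+1}^2: no halfspace fits all four points.
    examples = [((1, 1), 1), ((1, -1), -1), ((-1, 1), -1), ((-1, -1), 1)]
    w, theta = trivial_halfspace(examples)
    print(agreement(w, theta, examples))
```

The hardness result says that, even when some halfspace agrees with a (1-\varepsilon) fraction of the examples, beating this baseline by any constant \delta is NP-hard for a learner that must output a halfspace.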

Prasad Raghavendra | Venkatesan Guruswami

[1] Edith Cohen, et al. Learning noisy perceptrons by a perceptron in polynomial time, 1997, Proceedings 38th Annual Symposium on Foundations of Computer Science.

[2] Uri Zwick, et al. Finding almost-satisfying assignments, 1998, STOC '98.

[3] Rocco A. Servedio, et al. Agnostically Learning Halfspaces, 2005, FOCS.

[4] Vitaly Feldman. Optimal hardness results for maximizing agreements with monomials, 2006, 21st Annual IEEE Conference on Computational Complexity (CCC'06).

[5] Edoardo Amaldi, et al. On the Approximability of Minimizing Nonzero Variables or Unsatisfied Relations in Linear Systems, 1998, Theor. Comput. Sci.

[6] Shai Ben-David, et al. On the difficulty of approximately maximizing agreements, 2000, J. Comput. Syst. Sci.

[7] Uriel Feige, et al. On the hardness of approximating Max-Satisfy, 2006, Inf. Process. Lett.

[8] Linda Sellie, et al. Toward efficient agnostic learning, 1992, COLT '92.

[9] Magnús M. Halldórsson, et al. Approximations of Weighted Independent Set and Hereditary Subset Problems, 2022, Journal of Graph Algorithms and Applications.

[10] Vitaly Feldman, et al. New Results for Learning Noisy Parities and Halfspaces, 2006, 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06).

[11] Johan Håstad, et al. Some optimal inapproximability results, 2001, JACM.

[12] N. Littlestone. Learning Quickly When Irrelevant Attributes Abound: A New Linear-Threshold Algorithm, 1987, 28th Annual Symposium on Foundations of Computer Science (sfcs 1987).

[13] Ran Raz. A Parallel Repetition Theorem, 1998, SIAM J. Comput.

[14] S. Agmon. The Relaxation Method for Linear Inequalities, 1954, Canadian Journal of Mathematics.

[15] Alan M. Frieze, et al. A Polynomial-Time Algorithm for Learning Noisy Linear Threshold Functions, 1996, Algorithmica.

[16] Noga Alon, et al. The Space Complexity of Approximating the Frequency Moments, 1999.

[17] Noga Alon, et al. A Fast and Simple Randomized Parallel Algorithm for the Maximal Independent Set Problem, 1985, J. Algorithms.

[18] Noga Alon, et al. The Probabilistic Method, 2015, Fundamentals of Ramsey Theory.

[19] Moses Charikar, et al. Near-optimal algorithms for maximum constraint satisfaction problems, 2007, SODA '07.

[20] Prasad Raghavendra, et al. A 3-query PCP over integers, 2007, STOC '07.

[21] I. Anderson. Combinatorics of Finite Sets, 1987.

[22] Carsten Lund, et al. Proof verification and the hardness of approximation problems, 1998, JACM.

[23] P. Erdös. On a lemma of Littlewood and Offord, 1945.

[24] Moni Naor, et al. Derandomized Constructions of k-Wise (Almost) Independent Permutations, 2005, APPROX-RANDOM.

[25] Jacques Stern, et al. The hardness of approximate optima in lattices, codes, and systems of linear equations, 1993, Proceedings of 1993 IEEE 34th Annual Foundations of Computer Science.

[26] Nader H. Bshouty, et al. Maximizing agreements and coagnostic learning, 2006, Theor. Comput. Sci.