The Complexity of Adversarially Robust Proper Learning of Halfspaces with Agnostic Noise

We study the computational complexity of adversarially robust proper learning of halfspaces in the distribution-independent agnostic PAC model, with a focus on $L_p$ perturbations. We give a computationally efficient learning algorithm and a nearly matching computational hardness result for this problem. An interesting implication of our findings is that the case of $L_{\infty}$ perturbations is provably computationally harder than the case of $L_p$ perturbations for $2 \leq p < \infty$.
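To make the perturbation model concrete, the following is a standard formulation sketched here for illustration (the radius $\gamma$ and dual exponent $q$ are our notation, not necessarily the paper's): a halfspace $x \mapsto \mathrm{sign}(\langle w, x \rangle)$ classifies a labeled example $(x, y) \in \mathbb{R}^d \times \{\pm 1\}$ correctly under every $L_p$ perturbation of radius $\gamma$ if and only if, by Hölder's inequality,

$$\inf_{\|\delta\|_p \leq \gamma} \, y \, \langle w, x + \delta \rangle \;=\; y \, \langle w, x \rangle - \gamma \, \|w\|_q \;>\; 0, \qquad \frac{1}{p} + \frac{1}{q} = 1.$$

In particular, robustness to $L_{\infty}$ perturbations is governed by the dual norm $\|w\|_1$, whereas for $2 \leq p < \infty$ the dual exponent satisfies $1 < q \leq 2$; this dual-norm dependence is one way to see why the two regimes can behave differently computationally.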
