On the Power of Abstention and Data-Driven Decision Making for Adversarial Robustness

We prove that classifiers with the ability to abstain are strictly more powerful than those without it against an adversary that can perturb datapoints by arbitrary amounts in random directions. Specifically, we show that no matter how well-behaved the natural data is, any classifier that cannot abstain will be defeated by such an adversary. By allowing abstention, however, we give a parameterized algorithm with provably good performance against such an adversary when classes are reasonably well-separated and the data dimension is high. We further use a data-driven method to set the algorithm's parameters so as to optimize the accuracy vs. abstention trade-off, with strong theoretical guarantees. Our theory has direct applications to contrastive learning, where we empirically demonstrate that our algorithms achieve high robust accuracy with only small amounts of abstention, in both supervised and self-supervised settings.
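To make the abstention mechanism concrete, below is a minimal sketch of one natural instantiation: a nearest-neighbor classifier that abstains whenever the query point lies farther than a threshold tau from every training point, together with a simple data-driven grid search that trades accuracy against abstention on held-out data. The threshold rule, the grid of candidate taus, and the linear abstention penalty c are illustrative assumptions for this sketch, not the paper's exact algorithm or its guarantees.

```python
import numpy as np

ABSTAIN = -1  # sentinel "label" for abstention; assumes real labels are non-negative

def predict_or_abstain(X_train, y_train, x, tau):
    """1-nearest-neighbor prediction that abstains when the query point
    lies farther than tau from every training point (illustrative only)."""
    dists = np.linalg.norm(X_train - x, axis=1)  # distances to all training points
    j = np.argmin(dists)
    if dists[j] > tau:
        return ABSTAIN  # far from all training data: likely a large perturbation, so abstain
    return y_train[j]

def tune_tau(X_train, y_train, X_val, y_val, taus, c):
    """Data-driven choice of tau: pick the threshold on a validation set that
    maximizes accuracy minus a penalty of c per abstention (the candidate
    grid `taus` and the linear penalty are assumptions of this sketch)."""
    def score(tau):
        preds = np.array([predict_or_abstain(X_train, y_train, x, tau) for x in X_val])
        accuracy = np.mean(preds == y_val)
        abstention_rate = np.mean(preds == ABSTAIN)
        return accuracy - c * abstention_rate
    return max(taus, key=score)
```

A small tau abstains often but rarely errs on perturbed inputs; a large tau answers almost everything, recovering an ordinary nearest-neighbor classifier. Choosing tau from data is one simple way to navigate the accuracy vs. abstention trade-off the abstract describes.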
