Robustness for Non-Parametric Classification: A Generic Attack and Defense

Adversarially robust machine learning has received much recent attention. However, prior attacks and defenses for non-parametric classifiers have been developed on an ad-hoc, classifier-specific basis. In this work, we take a holistic look at adversarial examples for non-parametric classifiers, including nearest neighbors, decision trees, and random forests. We provide a general defense method, adversarial pruning, which preprocesses the training set so that it becomes well-separated. To test our defense, we provide a novel attack that applies to a wide range of non-parametric classifiers. Theoretically, we derive an optimally robust classifier, which is analogous to the Bayes optimal classifier. We show that adversarial pruning can be viewed as a finite-sample approximation to this optimal classifier. We empirically show that our defense and attack are either better than or competitive with prior work on non-parametric classifiers. Overall, our results provide a strong and broadly applicable baseline for future work on robust non-parametrics. Code is available at this https URL.
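The abstract describes adversarial pruning only at a high level: preprocess the training data so that differently-labeled points are well-separated, then train any non-parametric classifier on the pruned set. The sketch below is one plausible reading of that idea, not the paper's exact algorithm; the separation radius r, the Lp norm, and the greedy conflict-removal heuristic are all assumptions made for illustration.

```python
# Minimal sketch of "adversarial pruning" (assumed details: radius r,
# Lp norm, greedy removal). The paper's exact method may differ, e.g.
# by removing an optimal set of conflicting points rather than greedily.
import numpy as np
from scipy.spatial.distance import cdist

def adversarial_prune(X, y, r, p=2):
    """Return indices of a subset of (X, y) in which any two points with
    different labels are more than 2*r apart (greedy heuristic)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    D = cdist(X, X, metric="minkowski", p=p)
    # Conflict graph: an edge joins two differently-labeled points within 2r.
    conflict = (D <= 2 * r) & (y[:, None] != y[None, :])
    keep = np.ones(len(X), dtype=bool)
    while True:
        # Count remaining conflicts for each still-kept point.
        counts = (conflict & keep[None, :] & keep[:, None]).sum(axis=1)
        worst = counts.argmax()
        if counts[worst] == 0:
            break  # no conflicting pairs remain; data is well-separated
        keep[worst] = False  # drop the point with the most conflicts
    return np.where(keep)[0]

# Usage sketch: fit any non-parametric classifier on the pruned data, e.g.
#   idx = adversarial_prune(X_train, y_train, r=0.3)
#   clf = KNeighborsClassifier(n_neighbors=1).fit(X_train[idx], y_train[idx])
```

The greedy step is only a stand-in: it repeatedly drops the point involved in the most cross-label conflicts, which guarantees the 2r separation but not that the fewest points are removed.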
