Hardening Classifiers against Evasion: the Good, the Bad, and the Ugly

Machine learning is widely used in security applications, particularly in the form of statistical classification aimed at distinguishing benign from malicious entities. Recent research has shown that such classifiers are often vulnerable to evasion attacks, whereby adversaries change their behavior so as to be categorized as benign while preserving malicious functionality. Research into evasion attacks has followed two paradigms: attacks in problem space, where the actual malicious instance, such as a PDF file, is modified directly, and attacks in feature space, where the attack is abstracted into modifying the numerical features extracted from malicious instances rather than the instances themselves. However, there has been no prior validation of how faithfully feature-space threat models represent real evasion attacks. We make several contributions to address this gap, using PDF malware detection as a case study with four PDF malware detectors. First, we use iterative retraining to create a baseline for evasion-robust PDF malware detection by placing an automated problem-space attack generator in the retraining loop. Second, we use this baseline to demonstrate that replacing problem-space attacks with feature-space attacks can significantly reduce the robustness of the resulting classifier. Third, we demonstrate the existence of conserved (or invariant) features, show how these can be leveraged to design evasion-robust classifiers that are nearly as effective as those relying on the problem-space attack, and present an approach for automatically identifying conserved features of PDF malware detectors. Finally, we evaluate the generalizability of evasion defense through retraining by considering two additional evasion attacks.
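
To make the retraining baseline concrete, the sketch below shows a generic iterative adversarial retraining loop of the kind the abstract describes. It is not the authors' implementation: the dataset, the `attack` callable, and the feature representation are hypothetical placeholders, and the classifier is an arbitrary scikit-learn model chosen for illustration. In the problem-space setting, `attack` would stand in for a real mutation engine that modifies PDF seeds and re-extracts their features; a feature-space variant would instead perturb the feature vectors directly.

```python
# Minimal sketch of iterative adversarial retraining (illustrative only, not
# the paper's exact algorithm). The attack generator and data are placeholders.

from typing import Callable

import numpy as np
from sklearn.ensemble import RandomForestClassifier


def retrain_with_attacks(
    X_benign: np.ndarray,
    X_malicious: np.ndarray,
    attack: Callable[[RandomForestClassifier, np.ndarray], np.ndarray],
    n_rounds: int = 10,
) -> RandomForestClassifier:
    """Iteratively augment the malicious class with evasive variants.

    `attack` takes the current classifier and feature vectors of malicious
    seeds, and returns feature vectors of the evasive variants it produced.
    In problem space this corresponds to mutating real PDFs and re-extracting
    features; here everything is abstracted to feature vectors.
    """
    X_mal_aug = X_malicious.copy()
    clf = RandomForestClassifier(n_estimators=100)

    for _ in range(n_rounds):
        # Retrain on benign data plus all evasive variants found so far
        # (label 0 = benign, label 1 = malicious).
        X = np.vstack([X_benign, X_mal_aug])
        y = np.concatenate([np.zeros(len(X_benign)), np.ones(len(X_mal_aug))])
        clf.fit(X, y)

        # Run the attack generator against the current model.
        evasive = attack(clf, X_malicious)
        still_evading = evasive[clf.predict(evasive) == 0]

        # Stop once the generator can no longer produce evasive samples.
        if len(still_evading) == 0:
            break
        X_mal_aug = np.vstack([X_mal_aug, still_evading])

    return clf
```

The abstract's second finding can be read directly off this loop: the robustness of the resulting classifier depends on which `attack` is plugged in, and substituting a feature-space attack for the problem-space generator may leave the retrained model far weaker against real evasive PDFs.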
