Infinity-Norm Support Vector Machines Against Adversarial Label Contamination

Machine-learning algorithms are increasingly being applied in security-related tasks such as spam and malware detection, with the aim of detecting never-before-seen attacks and novel threats. However, such techniques can expose specific vulnerabilities that may be exploited by carefully-crafted attacks. Support Vector Machines (SVMs) are a well-known and widely-used learning algorithm: they make their decisions based on a subset of the training samples, known as support vectors. We first show that this behaviour poses a risk to system security if an intelligent, adaptive attacker can manipulate the labels of a subset of the training samples. We then propose a countermeasure to mitigate this issue, based on infinity-norm regularization. The underlying rationale is to increase the number of support vectors and to balance their contributions to the decision function more evenly, thereby reducing the impact of the contaminating samples on training. Finally, we empirically show that the proposed defence strategy, referred to as the Infinity-norm SVM, can significantly improve classifier security under malicious label contamination in a real-world classification task involving malware detection.
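To make the idea concrete, the sketch below shows one plausible way an infinity-norm-regularized linear SVM can be trained: replacing the usual quadratic regularizer with the infinity norm of the weight vector turns the hinge-loss problem into a linear program. This is only an illustrative reading of the abstract; the function name linf_svm_train, the use of scipy.optimize.linprog, the regularization parameter C, and the primal formulation itself are assumptions, and the paper's actual formulation (e.g., a kernelized variant or different hyperparameterization) may differ.

import numpy as np
from scipy.optimize import linprog


def linf_svm_train(X, y, C=1.0):
    """Fit a linear SVM with infinity-norm regularisation (illustrative sketch).

    Solves   min_{w,b,t,xi}  t + C * sum(xi)
             s.t.  y_i * (w . x_i + b) >= 1 - xi_i,   xi_i >= 0,   |w_j| <= t,
    which is a linear program, since ||w||_inf = max_j |w_j| = t at the optimum.
    Labels y are assumed to be in {-1, +1}.
    """
    n, d = X.shape
    # Variable layout: [w_1..w_d, b, t, xi_1..xi_n]
    c = np.concatenate([np.zeros(d + 1), [1.0], C * np.ones(n)])

    # Margin constraints:  -y_i * (w . x_i + b) - xi_i <= -1
    A_margin = np.hstack([-y[:, None] * X,      # w columns
                          -y[:, None],          # b column
                          np.zeros((n, 1)),     # t column
                          -np.eye(n)])          # xi columns
    b_margin = -np.ones(n)

    # Infinity-norm constraints:  w_j - t <= 0  and  -w_j - t <= 0
    A_pos = np.hstack([np.eye(d), np.zeros((d, n + 2))])
    A_neg = np.hstack([-np.eye(d), np.zeros((d, n + 2))])
    A_pos[:, d + 1] = -1.0
    A_neg[:, d + 1] = -1.0

    A_ub = np.vstack([A_margin, A_pos, A_neg])
    b_ub = np.concatenate([b_margin, np.zeros(2 * d)])

    # w and b are free; t and the slacks xi are non-negative.
    bounds = [(None, None)] * (d + 1) + [(0, None)] * (n + 1)

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    w, b = res.x[:d], res.x[d]
    return w, b


# Hypothetical usage on a labelled training set (labels in {-1, +1}):
# w, b = linf_svm_train(X_train, y_train, C=1.0)
# y_pred = np.sign(X_test @ w + b)

Because the resulting problem is a linear program, it can be handled by off-the-shelf LP solvers; a kernelized or otherwise reparameterized variant of the defence would require a different formulation than the one sketched here.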
