Defending Support Vector Machines Against Data Poisoning Attacks

Support Vector Machines (SVMs) are vulnerable to targeted training data manipulations such as poisoning attacks and label flips. By carefully manipulating a subset of training samples, an attacker can force the learner to compute an incorrect decision boundary and thereby cause misclassifications. Given the growing use of SVMs in engineering and life-critical applications, we develop a novel defense algorithm that improves resistance to such attacks. Local Intrinsic Dimensionality (LID) is a promising metric that characterizes how much of an outlier a data sample is. In this work, we introduce a new approximation of LID, called K-LID, that uses kernel distances in the LID calculation, allowing LID to be estimated in high-dimensional transformed feature spaces. Using K-LID as a distinguishing characteristic, we then introduce a weighted SVM that de-emphasizes the effect of suspicious data samples on the decision boundary. Each sample is weighted by how likely its K-LID value is to come from the benign K-LID distribution rather than the attacked K-LID distribution. Experiments on benchmark data sets show that the proposed defense reduces classification error rates substantially (by 10% on average).
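To make the weighting idea concrete, below is a minimal, illustrative Python sketch rather than the paper's reference implementation. It assumes an RBF kernel, the standard maximum-likelihood LID estimator applied to kernel-induced distances, kernel density estimates of the benign and attacked K-LID distributions, and scikit-learn's SVC with per-sample weights; the function names, parameter values, synthetic data, and the way the two reference K-LID samples are obtained are all assumptions made for illustration.

import numpy as np
from scipy.stats import gaussian_kde
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC


def k_lid(X, k=20, gamma=0.1):
    # K-LID: the MLE LID estimator applied to kernel-induced distances
    # d(x, y) = sqrt(k(x, x) - 2 k(x, y) + k(y, y)) instead of Euclidean ones.
    K = rbf_kernel(X, gamma=gamma)
    diag = np.diag(K)
    d = np.sqrt(np.maximum(diag[:, None] - 2.0 * K + diag[None, :], 0.0))
    lids = np.empty(len(X))
    for i in range(len(X)):
        r = np.sort(d[i])[1:k + 1]          # k nearest neighbours, self excluded
        r = np.maximum(r, 1e-12)            # guard against zero distances
        m = np.mean(np.log(r / r[-1]))      # (1/k) * sum log(r_i / r_max)
        lids[i] = -1.0 / min(m, -1e-12)     # LID estimate, kept finite
    return lids


def lid_weights(lids, benign_ref, attacked_ref):
    # Weight = how likely a K-LID value is under the benign distribution
    # relative to the attacked one; values near 0 mark suspicious samples.
    p_benign = gaussian_kde(benign_ref)(lids)
    p_attack = gaussian_kde(attacked_ref)(lids)
    return p_benign / (p_benign + p_attack + 1e-12)


# Tiny synthetic demonstration; the data and the use of the crafted poison set
# itself as the "attacked" reference sample are illustrative assumptions only.
rng = np.random.default_rng(0)
X_clean = rng.normal(size=(200, 5))
y_clean = (X_clean[:, 0] > 0).astype(int)
X_poison = rng.normal(loc=3.0, size=(40, 5))      # crude stand-in for poisoned points
y_poison = 1 - (X_poison[:, 0] > 0).astype(int)   # flipped labels
X_train = np.vstack([X_clean, X_poison])
y_train = np.concatenate([y_clean, y_poison])

weights = lid_weights(k_lid(X_train), k_lid(X_clean), k_lid(X_poison))
clf = SVC(kernel="rbf", gamma=0.1).fit(X_train, y_train, sample_weight=weights)

Because each weight is a likelihood ratio squashed into [0, 1], samples whose K-LID values resemble those of attacked data contribute little to the fitted decision boundary, while benign-looking samples retain close to full weight.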
