A Data-driven Attack against Support Vectors of SVM

Machine learning (ML) is widely used across disciplines and real-world applications such as information retrieval, financial systems, health, biometrics and online social networks. However, its security against deliberate attacks has rarely been considered. Sophisticated adversaries can exploit specific vulnerabilities in classical ML algorithms to deceive intelligent systems. Before novel defences are developed, it is increasingly important to evaluate the security of machine learning techniques thoroughly, including the potential attacks against them, so that machine learning can be applied securely in adversarial settings. In this paper, we propose an effective attack strategy, supported by a mathematical proof, that crafts foreign support vectors to attack a classic ML algorithm, the Support Vector Machine (SVM). The new attack simultaneously minimizes the margin around the decision boundary and maximizes the hinge loss. We evaluate the attack in several real-world applications, including social spam detection, Internet traffic classification and image recognition. Experimental results show that the security of classifiers can be degraded by poisoning a small group of support vectors.
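The two quantities the attack targets can be illustrated with a minimal sketch. It assumes a linear SVM with a fixed weight vector `w` and bias `b`, the standard hinge loss max(0, 1 - y(w·x + b)), and the standard margin width 2/||w||. The function names and the toy data are hypothetical, and the code does not reproduce the attack proposed in the paper; it only shows how a single injected point can raise the hinge loss of a fixed classifier.

```python
import numpy as np

def hinge_loss(w, b, X, y):
    """Mean hinge loss max(0, 1 - y * (w.x + b)) over all samples."""
    scores = X @ w + b
    return float(np.mean(np.maximum(0.0, 1.0 - y * scores)))

def margin_width(w):
    """Distance between the hyperplanes w.x + b = +1 and w.x + b = -1."""
    return 2.0 / float(np.linalg.norm(w))

# Hypothetical toy data: two well-separated points, labels in {-1, +1}.
X = np.array([[2.0, 0.0], [-2.0, 0.0]])
y = np.array([1.0, -1.0])

# Assumed fixed linear classifier (not retrained in this sketch).
w = np.array([1.0, 0.0])
b = 0.0

clean_loss = hinge_loss(w, b, X, y)

# Inject one point on the negative side but labelled +1.
X_poisoned = np.vstack([X, [-1.0, 0.0]])
y_poisoned = np.append(y, 1.0)
poisoned_loss = hinge_loss(w, b, X_poisoned, y_poisoned)

print("margin width:", margin_width(w))
print("hinge loss before injection:", clean_loss)
print("hinge loss after injection:", poisoned_loss)
```

In a real poisoning setting the classifier would be retrained on the contaminated data, so both `w` and the margin width would change; this sketch keeps `w` fixed only to keep the arithmetic easy to follow.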
