Security analysis of online centroid anomaly detection

Security issues are crucial in a number of machine learning applications, especially in scenarios dealing with human activity rather than natural phenomena (e.g., information ranking, spam detection, malware detection, etc.). In such cases, learning algorithms may have to cope with manipulated data aimed at hampering decision making. Although some previous work addressed the issue of handling malicious data in the context of supervised learning, very little is known about the behavior of anomaly detection methods in such scenarios. In this contribution, we analyze the performance of a particular method--online centroid anomaly detection--in the presence of adversarial noise. Our analysis addresses the following security-related issues: formalization of learning and attack processes, derivation of an optimal attack, and analysis of attack efficiency and limitations. We derive bounds on the effectiveness of a poisoning attack against centroid anomaly detection under different conditions: attacker's full or limited control over the traffic and bounded false positive rate. Our bounds show that whereas a poisoning attack can be effectively staged in the unconstrained case, it can be made arbitrarily difficult (a strict upper bound on the attacker's gain) if external constraints are properly used. Our experimental evaluation, carried out on real traces of HTTP and exploit traffic, confirms the tightness of our theoretical bounds and the practicality of our protection mechanisms.

[1]  M. Mohri,et al.  Stability Bounds for Stationary φ-mixing and β-mixing Processes , 2010 .

[2]  Christopher Meek,et al.  Good Word Attacks on Statistical Spam Filters , 2005, CEAS.

[3]  Marius Kloft,et al.  A framework for quantitative security analysis of machine learning , 2009, AISec '09.

[4]  Klaus-Robert Müller,et al.  Incremental Support Vector Learning: Analysis, Implementation and Applications , 2006, J. Mach. Learn. Res..

[5]  Bernhard Schölkopf,et al.  Nonlinear Component Analysis as a Kernel Eigenvalue Problem , 1998, Neural Computation.

[6]  Dit-Yan Yeung,et al.  Parzen-window network intrusion detectors , 2002, Object recognition supported by user interaction for service robots.

[7]  L. Martein,et al.  On solving a linear program with one quadratic constraint , 1987 .

[8]  Salvatore J. Stolfo,et al.  Anagram: A Content Anomaly Detector Resistant to Mimicry Attack , 2006, RAID.

[9]  Sameer Singh,et al.  Novelty detection: a review - part 1: statistical approaches , 2003, Signal Process..

[10]  Barak A. Pearlmutter,et al.  Detecting intrusions using system calls: alternative data models , 1999, Proceedings of the 1999 IEEE Symposium on Security and Privacy (Cat. No.99CB36344).

[11]  E. Parzen On Estimation of a Probability Density Function and Mode , 1962 .

[12]  Blaine Nelson,et al.  Exploiting Machine Learning to Subvert Your Spam Filter , 2008, LEET.

[13]  Konrad Rieck,et al.  Language models for detection of unknown attacks in network traffic , 2006, Journal in Computer Virology.

[14]  Eleazar Eskin,et al.  The Spectrum Kernel: A String Kernel for SVM Protein Classification , 2001, Pacific Symposium on Biocomputing.

[15]  F. Barthe Learning from Dependent Observations , 2006 .

[16]  David Moore,et al.  Code-Red: a case study on the spread and victims of an internet worm , 2002, IMW '02.

[17]  Stephanie Forrest,et al.  Intrusion Detection Using Sequences of System Calls , 1998, J. Comput. Secur..

[18]  Wenke Lee,et al.  Misleading worm signature generators using deliberate noise injection , 2006, 2006 IEEE Symposium on Security and Privacy (S&P'06).

[19]  Gunnar Rätsch,et al.  An introduction to kernel-based learning algorithms , 2001, IEEE Trans. Neural Networks.

[20]  Blaine Nelson,et al.  The security of machine learning , 2010, Machine Learning.

[21]  Jean-Philippe Vert,et al.  Consistency and Convergence Rates of One-Class SVMs and Related Algorithms , 2006, J. Mach. Learn. Res..

[22]  Klaus-Robert Müller,et al.  Intrusion detection in unlabeled data with quarter-sphere Support Vector Machines , 2004 .

[23]  Zhuoqing Morley Mao,et al.  Automated Classification and Analysis of Internet Malware , 2007, RAID.

[24]  C. van de Panne Programming with a Quadratic Constraint , 1966 .

[25]  Christopher Meek,et al.  Adversarial learning , 2005, KDD '05.

[26]  Amir Globerson,et al.  Nightmare at test time: robust learning by feature deletion , 2006, ICML.

[27]  N. Littlestone Learning Quickly When Irrelevant Attributes Abound: A New Linear-Threshold Algorithm , 1987, 28th Annual Symposium on Foundations of Computer Science (sfcs 1987).

[28]  Salvatore J. Stolfo,et al.  Casting out Demons: Sanitizing Training Data for Anomaly Sensors , 2008, 2008 IEEE Symposium on Security and Privacy (sp 2008).

[29]  Salvatore J. Stolfo,et al.  Anomalous Payload-Based Worm Detection and Signature Generation , 2005, RAID.

[30]  Bernhard Schölkopf,et al.  Estimating the Support of a High-Dimensional Distribution , 2001, Neural Computation.

[31]  Carsten Willems,et al.  Learning and Classification of Malware Behavior , 2008, DIMVA.

[32]  Konrad Rieck,et al.  Linear-Time Computation of Similarity Measures for Sequential Data , 2008, J. Mach. Learn. Res..

[33]  Eyal Kushilevitz,et al.  PAC learning with nasty noise , 1999, Theor. Comput. Sci..

[34]  A. Joseph,et al.  Bounding an Attack ’ s Complexity for a Simple Learning Model , 2006 .

[35]  Sameer Singh,et al.  Novelty detection: a review - part 2: : neural network based approaches , 2003, Signal Process..

[36]  Blaine Nelson,et al.  Can machine learning be secure? , 2006, ASIACCS '06.

[37]  Alexander J. Smola,et al.  Learning with kernels , 1998 .

[38]  Konrad Rieck,et al.  Detecting Unknown Network Attacks Using Language Models , 2006, DIMVA.

[39]  David Moore,et al.  The Spread of the Witty Worm , 2004, IEEE Secur. Priv..

[40]  Pedro M. Domingos,et al.  Adversarial classification , 2004, KDD.

[41]  Stephanie Forrest,et al.  A sense of self for Unix processes , 1996, Proceedings 1996 IEEE Symposium on Security and Privacy.

[42]  Jaideep Srivastava,et al.  A Comparative Study of Anomaly Detection Schemes in Network Intrusion Detection , 2003, SDM.

[43]  Klaus-Robert Müller,et al.  Covariate Shift Adaptation by Importance Weighted Cross Validation , 2007, J. Mach. Learn. Res..

[44]  Salvatore J. Stolfo,et al.  On the infeasibility of modeling polymorphic shellcode , 2009, Machine Learning.

[45]  Peter L. Bartlett,et al.  Open problems in the security of learning , 2008, AISec '08.

[46]  Salvatore J. Stolfo,et al.  Adaptive Anomaly Detection via Self-calibration and Dynamic Updating , 2009, RAID.

[47]  D. Angluin,et al.  Learning From Noisy Examples , 1988, Machine Learning.

[48]  Ohad Shamir,et al.  Learning to classify with missing and corrupted features , 2008, ICML '08.

[49]  Ming-Yang Kao,et al.  Hamsa: fast signature generation for zero-day polymorphic worms with provable attack resilience , 2006, 2006 IEEE Symposium on Security and Privacy (S&P'06).

[50]  Wenke Lee,et al.  Evading network anomaly detection systems: formal reasoning and practical techniques , 2006, CCS '06.

[51]  Don R. Hush,et al.  A Classification Framework for Anomaly Detection , 2005, J. Mach. Learn. Res..

[52]  Bernhard E. Boser,et al.  A training algorithm for optimal margin classifiers , 1992, COLT '92.

[53]  Robert P. W. Duin,et al.  Support vector domain description , 1999, Pattern Recognit. Lett..

[54]  Wenke Lee,et al.  Polymorphic Blending Attacks , 2006, USENIX Security Symposium.

[55]  Marimuthu Palaniswami,et al.  Quarter Sphere Based Distributed Anomaly Detection in Wireless Sensor Networks , 2007, 2007 IEEE International Conference on Communications.

[56]  A. Tsybakov On nonparametric estimation of density level sets , 1997 .

[57]  Joachim M. Buhmann,et al.  On Relevant Dimensions in Kernel Feature Spaces , 2008, J. Mach. Learn. Res..

[58]  W. Polonik Measuring Mass Concentrations and Estimating Density Contour Clusters-An Excess Mass Approach , 1995 .

[59]  P. Laskov,et al.  Intrusion Detection in Unlabeled Data with Quarter-sphere Support Vector Machines , 2004, Prax. Inf.verarb. Kommun..

[60]  Nello Cristianini,et al.  Kernel Methods for Pattern Analysis , 2003, ICTAI.

[61]  Salvatore J. Stolfo,et al.  Anomalous Payload-Based Network Intrusion Detection , 2004, RAID.

[62]  James Newsome,et al.  Paragraph: Thwarting Signature Learning by Training Maliciously , 2006, RAID.

[63]  Vladimir Vapnik,et al.  Statistical learning theory , 1998 .

[64]  Ming Li,et al.  Learning in the presence of malicious errors , 1993, STOC '88.

[65]  Peter Auer Learning Nested Differences in the Presence of Malicious Noise , 1997, Theor. Comput. Sci..