Practical Attacks Against Graph-based Clustering

Graph modeling allows numerous security problems to be tackled in a general way; however, little work has been done to understand how well graph-based techniques withstand adversarial attacks. We design and evaluate two novel graph attacks against a state-of-the-art network-level, graph-based detection system. Our work highlights areas in adversarial machine learning that have not yet been addressed, specifically graph-based clustering techniques and a global feature space in which defenders, to be practical, must account for realistic attackers without perfect knowledge. Even though less informed attackers can evade graph clustering at low cost, we show that some practical defenses are possible.
