Flow-based Detection and Proxy-based Evasion of Encrypted Malware C2 Traffic

State-of-the-art deep learning techniques are known to be vulnerable to evasion attacks, in which an adversarial sample is generated from a malicious sample and misclassified as benign. Detection of encrypted malware command and control (C2) traffic based on TCP/IP flow features can be framed as a learning task and is thus vulnerable to evasion attacks. However, unlike in image processing, for example, where generated adversarial samples can be directly mapped to images, going from flow features to actual TCP/IP packets requires crafting the sequence of packets. There is no established approach for such crafting, and crafting limits the set of features that can be modified. In this paper we discuss the learning and evasion consequences of the gap between generated and crafted adversarial samples. We exemplify with a deep neural network detector trained on a public C2 traffic dataset, white-box adversarial learning, and a proxy-based approach for crafting longer flows. Our results show that 1) the high evasion rate obtained by using generated adversarial samples on the detector can be significantly reduced when using crafted adversarial samples; 2) the robustness against adversarial samples gained by model hardening varies according to the crafting approach and the corresponding set of modifiable features that the attack allows for; 3) incrementally training hardened models with adversarial samples can produce a level playing field where, in a given set of attacks and detectors, no detector is best against all attacks and no attack is best against all detectors. To the best of our knowledge, this is the first time that level-playing-field feature-set and iteration hardening are analyzed in encrypted malware C2 traffic detection.
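The gap between generated and crafted adversarial samples can be illustrated with a minimal sketch. It is not the detector or the crafting procedure of the paper: it assumes a toy logistic-regression detector in place of the deep neural network, a single fast-gradient-sign step as the white-box attack, and a hypothetical binary mask marking which flow features a crafting approach could actually change. All weights, feature values, and the mask are made up for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def detect(x, w, b):
    """Return True if the toy detector flags the flow as malicious."""
    return bool(sigmoid(x @ w + b) >= 0.5)

def fgsm_step(x, w, b, eps, mask=None):
    """One fast-gradient-sign step on a malicious sample (label 1).

    The gradient of the cross-entropy loss with respect to x, for
    label 1, is (p - 1) * w. Stepping along its sign increases the
    loss, i.e. pushes the sample toward the benign class.
    """
    p = sigmoid(x @ w + b)
    grad = (p - 1.0) * w
    step = eps * np.sign(grad)
    if mask is not None:
        # Assumption: only masked-in features can be changed when
        # real packets have to be crafted.
        step = step * mask
    return x + step

# Hypothetical detector parameters and one malicious flow (4 features).
w = np.array([2.0, -1.0, 1.5, 0.5])
b = -1.0
x_mal = np.array([1.0, 0.2, 0.8, 0.6])

# Generated sample: every feature may be perturbed.
x_generated = fgsm_step(x_mal, w, b, eps=0.6)

# Crafted sample: only features 1 and 3 are assumed modifiable.
modifiable = np.array([0.0, 1.0, 0.0, 1.0])
x_crafted = fgsm_step(x_mal, w, b, eps=0.6, mask=modifiable)

print("original detected: ", detect(x_mal, w, b))
print("generated detected:", detect(x_generated, w, b))
print("crafted detected:  ", detect(x_crafted, w, b))
```

With these made-up numbers the unconstrained sample evades the toy detector while the constrained one is still flagged, which mirrors the qualitative point of result 1) only; it says nothing about the evasion rates reported in the paper.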
