GAN Tunnel: Network Traffic Steganography by Using GANs to Counter Internet Traffic Classifiers

In this paper, we introduce a novel traffic-masking method, called the Generative Adversarial Network (GAN) tunnel, that protects the identity of the applications generating network traffic from adversarial Internet traffic classifiers (ITCs). Such ITCs have been used for website fingerprinting and for detecting network protocols, and they are increasingly used to infer user information. ITCs based on machine learning can identify user applications by analyzing the statistical features of encrypted packets. The proposed GAN tunnel generates traffic that mimics a decoy application and encapsulates the actual user traffic in the GAN-generated traffic to prevent classification by adversarial ITCs. We show that the statistical distributions of the generated traffic features closely resemble those of the actual network traffic. Therefore, the actual user applications and the information associated with the user remain anonymous. We test the GAN tunnel traffic against high-performing ITCs, such as Random Forest and eXtreme Gradient Boosting (XGBoost), and show that the GAN tunnel effectively protects the identity of the source applications.
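The encapsulation step described above can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes that the only feature being shaped is packet size, that each frame carries a hypothetical 2-byte header giving the number of real payload bytes, and that `sample_decoy_sizes` is a stand-in for a trained GAN generator (here it simply draws from a made-up set of sizes). The names `encapsulate` and `decapsulate` are likewise illustrative.

```python
import random
import struct

# Assumed framing: a 2-byte big-endian count of real payload bytes per frame.
HEADER = struct.Struct("!H")


def sample_decoy_sizes(n, rng):
    """Hypothetical stand-in for a trained GAN generator.

    Returns n packet sizes drawn from a made-up decoy size set.
    """
    return [rng.choice([120, 576, 1200, 1460]) for _ in range(n)]


def encapsulate(payload, sizes):
    """Pack payload into frames whose lengths equal the decoy sizes."""
    frames, offset = [], 0
    for size in sizes:
        if size < HEADER.size:
            raise ValueError("frame size smaller than header")
        # Take as much real data as fits, then zero-pad to the decoy size.
        chunk = payload[offset:offset + size - HEADER.size]
        offset += len(chunk)
        padding = bytes(size - HEADER.size - len(chunk))
        frames.append(HEADER.pack(len(chunk)) + chunk + padding)
    if offset < len(payload):
        raise ValueError("not enough decoy frames to carry the payload")
    return frames


def decapsulate(frames):
    """Recover the original payload by dropping headers and padding."""
    out = bytearray()
    for frame in frames:
        (n,) = HEADER.unpack_from(frame)
        out += frame[HEADER.size:HEADER.size + n]
    return bytes(out)


if __name__ == "__main__":
    rng = random.Random(0)
    payload = b"user application data " * 50
    sizes = sample_decoy_sizes(40, rng)
    frames = encapsulate(payload, sizes)
    # An observer sees only the decoy-shaped frame lengths.
    print([len(f) for f in frames] == sizes, decapsulate(frames) == payload)
```

In this sketch the observable frame lengths are dictated entirely by the generator's output, and frames with no remaining user data become pure padding. A fuller treatment would also shape timing and other statistical features, and would encrypt the frames so that padding is indistinguishable from payload.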
