User behavior based traffic emulator: A framework for generating test data for DPI tools

Abstract Deep Packet Inspection (DPI) engines depend heavily on their operating environment, i.e., the traffic mix they are expected to work with. A well-performing DPI engine must be tested on real-world traffic mixes. Due to privacy issues, real-world traffic is usually available only at the network operator's site, at a secure measurement point. Furthermore, realistic measurements are essential for signature updates, performance tweaks, and similar maintenance of the DPI engine. In this paper we present a traffic generation framework that continuously provides up-to-date traffic mixes. The basic idea of the framework is to generate traffic through automatic user behavior emulation. Real-world traffic measurements are processed to analyze and extract the most typical user behavior scenarios. Our proposed method uses these typical behaviors to emulate users on remote-controlled hosts while the network traffic of the user equipment is recorded. As a final step, the framework builds high-speed multiplexed traces from the recorded data that mimic the behavior of real traffic. The paper also evaluates the characteristics of the constructed traffic against real-world traffic measurements, showing that the framework can generate realistic traffic traces that are suitable for DPI testing and can be publicly distributed without privacy concerns. The proof-of-concept implementation of the presented system is open to the public [1].
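The final multiplexing step can be illustrated with a minimal sketch. It is not the framework's implementation: the `Packet` tuple layout, the `shift` and `multiplex` helpers, and the per-trace start offsets are all assumptions made for illustration. The sketch assumes each recorded trace is already sorted by timestamp, shifts it by a start offset, and merges all traces into one time-ordered stream.

```python
import heapq
from typing import Iterable, Iterator, List, Tuple

# Hypothetical packet record: (timestamp in seconds, user id, size in bytes).
Packet = Tuple[float, str, int]


def shift(trace: Iterable[Packet], offset: float) -> Iterator[Packet]:
    """Yield the packets of one recorded trace, delayed by `offset` seconds."""
    for ts, user, size in trace:
        yield (ts + offset, user, size)


def multiplex(traces: List[List[Packet]], offsets: List[float]) -> List[Packet]:
    """Merge several time-sorted traces into one time-sorted trace.

    Each trace is shifted by its own start offset so that emulated users
    do not all begin at the same instant.
    """
    if len(traces) != len(offsets):
        raise ValueError("need exactly one offset per trace")
    shifted = [shift(t, o) for t, o in zip(traces, offsets)]
    # heapq.merge performs a lazy k-way merge of already sorted inputs.
    return list(heapq.merge(*shifted, key=lambda p: p[0]))


# Two tiny made-up traces standing in for recorded user sessions.
trace_a = [(0.0, "a", 100), (1.0, "a", 200)]
trace_b = [(0.0, "b", 50), (0.5, "b", 60)]
merged = multiplex([trace_a, trace_b], [0.0, 0.25])
```

A k-way merge keeps memory use proportional to the number of traces rather than the number of packets, which matters when many recorded sessions are combined into a single high-speed trace.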

[1] Paul Barford, et al. Self-configuring network traffic generation, 2004, IMC '04.

[2] Amin Vahdat, et al. Swing: realistic and responsive network traffic generation, 2009, TNET.

[3] Tao Ye, et al. Divide and conquer: PC-based packet trace replay at OC-48 speeds, 2005, First International Conference on Testbeds and Research Infrastructures for the DEvelopment of NeTworks and COMmunities.

[4] István Szabó, et al. On the Validation of Traffic Classification Algorithms, 2008, PAM.

[5] Ke Xu, et al. AutoSig-Automatically Generating Signatures for Applications, 2009, 2009 Ninth IEEE International Conference on Computer and Information Technology.

[6] Sándor Molnár, et al. How to validate traffic generators?, 2013, 2013 IEEE International Conference on Communications Workshops (ICC).

[7] Antonio Pescapè, et al. Searching for invariants in network games traffic, 2006, CoNEXT '06.

[8] Jens Myrup Pedersen, et al. Volunteer-based system for classification of traffic in computer networks, 2011, 2011 19th Telecommunications Forum (TELFOR) Proceedings of Papers.

[9] Antonio Pescapè, et al. Issues and future directions in traffic classification, 2012, IEEE Network.

[10] Richard Nelson, et al. Measuring the accuracy of open-source payload-based traffic classifiers using popular Internet applications, 2013, 38th Annual IEEE Conference on Local Computer Networks - Workshops.

[11] S. Giordano, et al. BRUNO: A high performance traffic generator for network processor, 2008, 2008 International Symposium on Performance Evaluation of Computer and Telecommunication Systems.

[12] Niccolo Cascarano, et al. GT: picking up the truth from the ground for internet traffic, 2009, CCRV.

[13] Antonio Pescapè, et al. Systematic Performance Modeling and Characterization of Heterogeneous IP Networks, 2005, 11th International Conference on Parallel and Distributed Systems (ICPADS'05).

[14] Antonio Pescapè, et al. A tool for the generation of realistic network workload for emerging networking scenarios, 2012, Comput. Networks.

[15] Michele C. Weigle, et al. Tmix: a tool for generating realistic TCP application workloads in ns-2, 2006, CCRV.

[16] Judith Kelner, et al. A Survey on Internet Traffic Identification, 2009, IEEE Communications Surveys & Tutorials.

[17] Wu-chi Feng, et al. A traffic characterization of popular on-line games, 2005, IEEE/ACM Transactions on Networking.

[18] Sándor Molnár, et al. Finding Typical Internet User Behaviors, 2012, EUNICE.

[19] Sándor Molnár, et al. Multi-functional emulator for traffic analysis, 2013, 2013 IEEE International Conference on Communications (ICC).

[20] Wu-chi Feng, et al. TCPivo: a high-performance packet replay engine, 2003, MoMeTools '03.

[21] Patrice Abry, et al. Wavelets for the Analysis, Estimation, and Synthesis of Scaling Data, 2002.

[22] Charles V. Wright, et al. Generating Client Workloads and High-Fidelity Network Traffic for Controllable, Repeatable Experiments in Computer Security, 2010, RAID.

[23] Sándor Molnár, et al. Automatic protocol signature generation framework for deep packet inspection, 2011, VALUETOOLS.

[24] Peter Megyesi, et al. Multi-functional traffic generation framework based on accurate user behavior emulation, 2013, 2013 IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS).

[25] Sally Floyd, et al. Difficulties in simulating the internet, 2001, TNET.

[26] Pere Barlet-Ros, et al. Independent comparison of popular DPI tools for traffic classification, 2015, Comput. Networks.