Large-Scale Realistic Network Data Generation on a Budget

Many novel problems in computer networking require relevant network trace data during the research process. Unfortunately, such data is often hard to find, which becomes a problem in itself. In-lab network testbeds and simulators are both feasible ways to generate appropriate data, but the former are limited in network scale, while the latter are limited in the data they generate. To help address these issues, we present an approach for generating realistic network trace data in a contained, large-scale network environment. We use network emulation to enable large-scale, in-lab networking, together with a software framework we developed to support autonomous client-side protocols and services, including user-behavioral models that scale in a shared CPU environment. Our framework also enables quick experiment setup and monitoring. We show through experimentation on a low-end laptop that our approach scales to networks of hundreds of nodes, allowing anyone with even basic hardware to generate potentially relevant, realistic network data.
