Generating Client Workloads and High-Fidelity Network Traffic for Controllable, Repeatable Experiments in Computer Security

Rigorous scientific experimentation in system and network security remains an elusive goal. Recent work has outlined three basic requirements for experiments: hypotheses must be falsifiable, experiments must be controllable, and experiments must be repeatable and reproducible. Despite their simplicity, these requirements are difficult to meet, especially when dealing with client-side threats and defenses, where user input is often required as part of the experiment. In this paper, we present techniques for making experiments that involve security and client-side desktop applications, such as web browsers, PDF readers, and host-based firewalls or intrusion detection systems, more controllable and more easily repeatable. First, we present techniques for using statistical models of user behavior to drive real, binary, GUI-enabled application programs in place of a human user. Second, we present techniques based on adaptive replay of application dialog that allow us to quickly and efficiently reproduce reasonable mock-ups of remotely hosted applications, giving the illusion of Internet connectedness on an isolated testbed. We demonstrate the utility of these techniques in an example experiment that compares the system resource consumption of a Windows machine running anti-virus protection with that of an unprotected system.
