Emulation Environment for Ground Truth Establishment

Network security is a major concern for network users and managers, whether institutional, enterprise or domestic. New threats, or mutations of existing ones, appear at a very fast rate, and the solutions currently used to fight them frequently require real-time analysis of the network traffic or prior training based on real data. Most of the time, this training must be supervised by humans who, depending on their experience, can unknowingly create security breaches in the system. Since existing anomaly detection methodologies have to be trained and tested in order to validate their efficiency, there is an increasing need for trustworthy network traffic data that can be used without compromising users' confidentiality, obeys pre-established criteria, and is completely known in terms of its underlying protocols. In fact, the effectiveness of network anomaly detectors cannot be fully evaluated without complete control of the entire evaluation experiment, which requires being able to change the location, magnitude and type of individual anomalies and of the background traffic. In this work, we propose an emulation environment that can be used to obtain trustworthy network data in the presence of both licit and illicit applications. We also present some topological and traffic scenarios that have already been defined in order to start gathering network data and make it immediately available to the scientific community. The emulation environment was built in an evolutionary way, enabling the easy introduction of new network scenarios and services and/or the refinement of existing ones.

Keywords: ground truth, emulation, licit and illicit applications.
