Remodeling and Simulation of Intrusion Detection Evaluation Dataset

Although the intrusion detection system (IDS) industry is maturing rapidly, the state of IDS evaluation is not. The off-line dataset evaluation proposed by MIT Lincoln Lab was a significant undertaking, but several issues in the design and modeling of the resulting dataset remain unsolved and may bias the evaluation results. In this paper we present our efforts to improve the traffic simulation. Unlike the existing model, our model draws on user-level web mining, automatic user profiling, the Enron email dataset, and other sources, which gives a more reasonable basis for traffic modeling and simulation. Experiments show the high fidelity of the simulated traffic. Moreover, different kinds of attacker personalities are profiled, and more than 300 instances of 62 different automated attacks are launched against victim hosts and servers. Together, these efforts aim to make the dataset more “real” and therefore fairer for IDS evaluation.