Experiences Benchmarking Intrusion Detection Systems

Intrusion Detection Systems (hereafter abbreviated as “IDS”) have recently garnered much interest in the computer security community. As interest has grown, the topic of testing and benchmarking IDS has also received a great deal of attention. It has not, however, received a great deal of thought: an embarrassingly large number of IDS “benchmarks” have proven to be so fundamentally flawed that they provide misleading information rather than useful results. In this paper, we discuss the topic of IDS benchmarking and present a few examples of poor benchmarks and how they can be fixed. We also present some guidelines on how to design and test IDS effectively.

Introduction: Benchmarks and Tests

Constructing good benchmarks is difficult regardless of what they propose to measure. To measure something complex accurately, you often need to expend considerable effort in designing tests, to make sure that the tests aren’t inherently biased or inaccurate. This is especially difficult when you’re measuring something like an IDS that is highly dependent on its operating environment. Generally, when designing a test, you should first determine whether you want to:

a) Quantitatively measure varying systems against a predictable baseline