Automated Attacker Correlation for Malicious Code

Abstract: Correlating attacks is particularly difficult in the digital domain. Often the only real "trace" of an attack that can be obtained is executable code. The executable code of malicious software is therefore one of the primary pieces of evidence that must be examined to establish a correlation between seemingly independent attacks. Because building advanced, stealthy, persistent backdoors ("rootkits") requires high technical sophistication, code fragments are frequently reused. A major obstacle to correlating different executables is the high degree of variability that the compiler introduces when generating the final byte sequences. This paper presents the results of research on executable code comparison for attacker correlation. Instead of a byte-based approach, a structural approach is chosen. The result is a system that can identify code similarities in executables with an accuracy that often exceeds that of a human analyst, and at much higher speed.