LZfuzz: a fast compression-based fuzzer for poorly documented protocols

"Computers make very fast, very accurate mistakes."
(From a refrigerator magnet.)

Real-world infrastructure offers many scenarios in which protocols (and other details) are not released, because they are considered too sensitive or for other reasons. This makes it hard to fuzz-test such protocols for security and reliability: their full documentation is available only to their developers, and domain developer expertise does not necessarily overlap with fuzz-testing expertise (or with deployment responsibility). State-of-the-art fuzzing techniques work best when protocol specifications are available. Still, operators whose networks include equipment communicating via proprietary protocols should be able to reap the benefits of fuzz-testing them. In particular, administrators should be able to test proprietary protocols that lack end-to-end application-level encryption, in order to understand whether these protocols can withstand injection of bad traffic and thus to plan adequate network protection measures. Such protocols can be observed in action prior to fuzzing, and packet captures can be used to learn enough about the structure of the protocol to make fuzzing more efficient. Various machine learning approaches, e.g., methods borrowed from bioinformatics, have been proposed for learning models of the targeted protocols. The problem with most of these approaches to date is that, although sometimes quite successful, they are computationally very heavy and thus hardly practical for network administrators and equipment owners, who cannot easily dedicate a compute cluster to such tasks. We propose a simple method that, despite its roughness, allowed us to learn facts useful for fuzzing from protocol traces at much smaller CPU and time costs.
Our fuzzing approach proved itself empirically in testing actual proprietary SCADA protocols in an isolated control network test environment, and it also succeeded in triggering flaws in implementations of several popular commodity Internet protocols. Our fuzzer, LZfuzz (pronounced “lazy-fuzz”), relies on a variant of the Lempel–Ziv compression algorithm to guess boundaries between the structural units of the protocol, and builds on the well-known free software GPF fuzzer.
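
To illustrate how a Lempel–Ziv parse can suggest token boundaries in an undocumented protocol, the following is a minimal sketch only. It assumes a plain LZ78-style dictionary parse as a stand-in; the specific variant used by LZfuzz is not described in this section, and the function names `lz78_phrases` and `mutate_token` are hypothetical. Integration with GPF is not shown.

```python
import random


def lz78_phrases(data: bytes) -> list[bytes]:
    """Split data into LZ78-style phrases.

    Each phrase is the shortest prefix of the remaining input that has not
    been seen as a phrase before. Phrase ends serve as guessed boundaries
    between structural units (an assumption made for this sketch).
    """
    seen = set()
    phrases = []
    current = b""
    for i in range(len(data)):
        current += data[i:i + 1]
        if current not in seen:
            seen.add(current)
            phrases.append(current)
            current = b""
    if current:
        # Trailing bytes that match an already-known phrase.
        phrases.append(current)
    return phrases


def mutate_token(phrases: list[bytes], rng: random.Random) -> bytes:
    """Replace one guessed token with random bytes of the same length."""
    if not phrases:
        return b""
    idx = rng.randrange(len(phrases))
    mutated = list(phrases)
    mutated[idx] = bytes(rng.randrange(256) for _ in range(len(phrases[idx])))
    return b"".join(mutated)


if __name__ == "__main__":
    captured = b"abababab"  # stand-in for bytes taken from a packet capture
    tokens = lz78_phrases(captured)
    print(tokens)  # [b'a', b'b', b'ab', b'aba', b'b']
    print(mutate_token(tokens, random.Random(0)))
```

Mutating whole guessed tokens, rather than arbitrary byte offsets, is the point of the sketch: repeated substrings in captured traffic tend to become dictionary phrases, so mutations are more likely to fall on field-like units than on random positions.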