GANFuzz: a GAN-based industrial network protocol fuzzing framework

In this paper, we attempt to improve industrial safety from the perspective of communication security. We leverage the protocol fuzzing technology to reveal errors and vulnerabilities inside implementations of industrial network protocols(INPs). Traditionally, to effectively conduct protocol fuzzing, the test data has to be generated under the guidance of protocol grammar, which is built either by interpreting the protocol specifications or reverse engineering from network traces. In this study, we propose an automated test case generation method, in which the protocol grammar is learned by deep learning. Generative adversarial network(GAN) is employed to train a generative model over real-world protocol messages to enable us to learn the protocol grammar. Then we can use the trained generative model to produce fake but plausible messages, which are promising test cases. Based on this approach, we present an automatical and intelligent fuzzing framework(GANFuzz) for testing implementations of INPs. Compared to prior work, GANFuzz offers a new way for this problem. Moreover, GANFuzz does not rely on protocol specification, so that it can be applied to both public and proprietary protocols, which outperforms many previous frameworks. We use GANFuzz to test several simulators of the Modbus-TCP protocol and find some errors and vulnerabilities.

[1]  Yoshua Bengio,et al.  Maximum-Likelihood Augmented Discrete Generative Adversarial Networks , 2017, ArXiv.

[2]  Yishay Mansour,et al.  Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.

[3]  Anup K. Ghosh,et al.  An Approach for Analyzing the Robustness of Windows NT Software , 1998 .

[4]  Richard McNally,et al.  Fuzzing: The State of the Art , 2012 .

[5]  Anup K. Ghosh,et al.  Testing the robustness of Windows NT software , 1998, Proceedings Ninth International Symposium on Software Reliability Engineering (Cat. No.98TB100257).

[6]  A. Ng Feature selection, L1 vs. L2 regularization, and rotational invariance , 2004, Twenty-first international conference on Machine learning - ICML '04.

[7]  Adam Kiezun,et al.  Grammar-based whitebox fuzzing , 2008, PLDI '08.

[8]  Kevin C. Almeroth,et al.  SNOOZE: Toward a Stateful NetwOrk prOtocol fuzZEr , 2006, ISC.

[9]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[10]  James P. Crutchfield,et al.  Hidden Markov Models for Automated Protocol Learning , 2010, SecureComm.

[11]  Barton P. Miller,et al.  Fuzz Revisited: A Re-examination of the Reliability of UNIX Utilities and Services , 1995 .

[12]  Xiang Zhang,et al.  Character-level Convolutional Networks for Text Classification , 2015, NIPS.

[13]  Lantao Yu,et al.  SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient , 2016, AAAI.

[14]  Helen J. Wang,et al.  Discoverer: Automatic Protocol Reverse Engineering from Network Traces , 2007, USENIX Security Symposium.

[15]  Christopher Krügel,et al.  Automatic Network Protocol Analysis , 2008, NDSS.

[16]  Yoon Kim,et al.  Convolutional Neural Networks for Sentence Classification , 2014, EMNLP.

[17]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[18]  Nitish Srivastava,et al.  Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res..

[19]  Martín Abadi,et al.  TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems , 2016, ArXiv.

[20]  Léon Bottou,et al.  Wasserstein Generative Adversarial Networks , 2017, ICML.

[21]  Richard S. Sutton,et al.  Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.

[22]  Christopher Krügel,et al.  Prospex: Protocol Specification Extraction , 2009, 2009 30th IEEE Symposium on Security and Privacy.

[23]  Barton P. Miller,et al.  An empirical study of the reliability of UNIX utilities , 1990, Commun. ACM.

[24]  Rishabh Singh,et al.  Learn&Fuzz: Machine learning for input fuzzing , 2017, 2017 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE).

[25]  Matt J. Kusner,et al.  GANS for Sequences of Discrete Elements with the Gumbel-softmax Distribution , 2016, ArXiv.

[26]  Xuxian Jiang,et al.  Automatic Protocol Format Reverse Engineering through Context-Aware Monitored Execution , 2008, NDSS.

[27]  Richard S. Sutton,et al.  Introduction to Reinforcement Learning , 1998 .

[28]  Jie Cheng,et al.  CUDA by Example: An Introduction to General-Purpose GPU Programming , 2010, Scalable Comput. Pract. Exp..