Protocol State Machine Reverse Engineering with a Teaching-Learning Approach

We propose a novel solution to the problem of inferring the state machine of an unknown protocol. Building on and improving prior results on inferring Mealy machines, we present a new algorithm that interacts with a networked system running the unknown protocol and infers the Mealy machine that represents the protocol's state machine. To demonstrate the viability of the approach, we provide an implementation and illustrate the algorithm on a simple example protocol as well as on two real-world protocols, Modbus and MQTT.
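To make the terminology concrete, the following is a minimal sketch of a Mealy machine and of the kind of output query a learner could pose to a system under test. It is not the algorithm proposed in this work. The toy protocol (its states, the input symbols CONNECT, DATA, CLOSE, and the output symbols) and the helper names `output_query` and `distinguishes` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# (state, input symbol) -> (next state, output symbol)
Transitions = Dict[Tuple[str, str], Tuple[str, str]]

# Hypothetical toy protocol, invented for this sketch.
TOY_PROTOCOL: Transitions = {
    ("idle", "CONNECT"): ("connected", "ACK"),
    ("idle", "DATA"): ("idle", "ERR"),
    ("idle", "CLOSE"): ("idle", "ERR"),
    ("connected", "CONNECT"): ("connected", "ERR"),
    ("connected", "DATA"): ("connected", "OK"),
    ("connected", "CLOSE"): ("idle", "BYE"),
}


def output_query(machine: Transitions, initial: str, inputs: List[str]) -> List[str]:
    """Feed an input sequence to the machine from its initial state
    and return the sequence of outputs it produces."""
    state = initial
    outputs = []
    for symbol in inputs:
        state, out = machine[(state, symbol)]
        outputs.append(out)
    return outputs


def distinguishes(machine: Transitions, initial: str,
                  prefix_a: List[str], prefix_b: List[str],
                  suffix: List[str]) -> bool:
    """Return True if the suffix produces different outputs after the two
    prefixes, which shows that the prefixes lead to different states."""
    out_a = output_query(machine, initial, prefix_a + suffix)[len(prefix_a):]
    out_b = output_query(machine, initial, prefix_b + suffix)[len(prefix_b):]
    return out_a != out_b


if __name__ == "__main__":
    print(output_query(TOY_PROTOCOL, "idle", ["CONNECT", "DATA", "CLOSE"]))
    print(distinguishes(TOY_PROTOCOL, "idle", [], ["CONNECT"], ["DATA"]))
```

In this sketch the machine is a local table. In a networked setting, `output_query` would instead send the messages to the system running the unknown protocol and record its responses, resetting the system between queries.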
