Unsupervised field segmentation of unknown protocol messages

Abstract In network security systems for intrusion detection, deep packet inspection, and protocol fuzzing, protocol specifications recovered by Protocol Reverse Engineering (PRE) serve as a fundamental input. For binary protocols with fixed-length fields, the location of the field boundaries strongly affects both the subsequent analysis and the final performance. In this paper, we formally discuss the field segmentation problem and develop a method, ProSeg, by introducing and optimizing statistics from information theory (self-information and mutual information). By analyzing the format structure of messages from an unknown protocol vertically, the boundaries of fixed-length fields can be located with an expert voting strategy. In experiments and analysis on several common protocols, our method proves relatively effective, and the segmentations produced by ProSeg are largely consistent with the standard segmentations.
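The two statistics named in the abstract can be illustrated with a minimal sketch. It is not ProSeg itself: the abstract gives neither the optimization nor the expert voting strategy, so neither is reproduced here. The sketch assumes that "vertically" means comparing the byte at the same offset across many messages, and it reports the entropy of each offset (the expected self-information) and the mutual information between adjacent offsets. The function names `column_entropy`, `adjacent_mutual_information`, and `profile` are hypothetical.

```python
import math
from collections import Counter


def column_entropy(messages, i):
    """Shannon entropy in bits of the byte at offset i across all messages.

    This is the expected self-information -log2 p(x) of that offset.
    """
    n = len(messages)
    counts = Counter(m[i] for m in messages)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def adjacent_mutual_information(messages, i):
    """Mutual information in bits between the bytes at offsets i and i + 1."""
    n = len(messages)
    left = Counter(m[i] for m in messages)
    right = Counter(m[i + 1] for m in messages)
    joint = Counter((m[i], m[i + 1]) for m in messages)
    # I(X;Y) = sum p(x,y) * log2(p(x,y) / (p(x) p(y))), using empirical counts.
    return sum(
        c / n * math.log2(c * n / (left[a] * right[b]))
        for (a, b), c in joint.items()
    )


def profile(messages):
    """Return per-offset entropies and adjacent-offset mutual information.

    Assumption: only the prefix shared by all messages is analyzed.
    """
    length = min(len(m) for m in messages)
    entropy = [column_entropy(messages, i) for i in range(length)]
    mi = [adjacent_mutual_information(messages, i) for i in range(length - 1)]
    return entropy, mi


# Toy trace (made up for illustration): a constant byte followed by
# two bytes that always change together.
toy_messages = [bytes([1, 0, 0]), bytes([1, 1, 1]), bytes([1, 0, 0]), bytes([1, 1, 1])]
toy_entropy, toy_mi = profile(toy_messages)
```

In this toy trace, the mutual information between offsets 0 and 1 is zero while offsets 1 and 2 share one full bit, which is the kind of contrast a boundary detector could exploit. How ProSeg actually combines such signals is not stated in the abstract.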
