Generating Statistic Application Signatures for Inference of Unknown Applications

In this paper, we propose a novel protocol reverse-engineering approach that extracts the protocol keywords of an unknown application from raw network traffic, without prior knowledge of the application, using compression theory, entropy, and variance analysis. We also present an efficient method for generating a statistical signature of an unknown application by leveraging machine learning and probabilistic models. The experimental results show that our approach extracts application protocol keywords with high accuracy, and that application identification using our method yields very low false positive and false negative rates. Our technique can also discover new applications in unknown traffic.
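
To make the entropy-based part of the idea concrete, the following is a minimal illustrative sketch and not the method evaluated in the paper. It assumes that protocol keywords tend to sit at fixed byte offsets where the byte value barely varies across messages, so those offsets show low Shannon entropy. The function names `positional_entropy` and `low_entropy_offsets`, the `max_offset` and `threshold` parameters, and the sample messages are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def positional_entropy(messages, max_offset=8):
    """Shannon entropy (in bits) of the byte value at each offset across messages.

    Hypothetical helper: the paper does not specify this exact computation.
    """
    entropies = []
    for offset in range(max_offset):
        # Collect the byte found at this offset in every message long enough.
        column = [m[offset] for m in messages if len(m) > offset]
        if not column:
            break
        counts = Counter(column)
        total = len(column)
        # H = sum over byte values of p * log2(1 / p)
        h = sum((c / total) * math.log2(total / c) for c in counts.values())
        entropies.append(h)
    return entropies


def low_entropy_offsets(messages, max_offset=8, threshold=0.5):
    """Offsets whose entropy is at or below an assumed threshold (keyword candidates)."""
    entropies = positional_entropy(messages, max_offset)
    return [i for i, h in enumerate(entropies) if h <= threshold]


# Toy payloads (invented for illustration): a constant prefix followed by a varying byte.
messages = [b"GET /a", b"GET /b", b"GET /c"]
candidates = low_entropy_offsets(messages)
print(candidates)
```

In this toy input the constant prefix occupies offsets 0 through 4, so those are the offsets reported as keyword candidates, while the varying final byte is excluded. A realistic pipeline would also need the compression and variance analysis mentioned above, which this sketch omits.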
