A Hybrid Method based on Statistical Features and Packet Content Analysis to Identify Major Network Tunneling Protocols

Network traffic identification is an essential component of effective network analysis and management. Signature-based and machine learning techniques are the two most important approaches to network traffic analysis. Because each approach has its own strengths and weaknesses, combining them can offset the weaknesses of each in the detection process. This article introduces a hybrid method for identifying major network tunneling protocols. The method detects well-known tunneling protocols by combining signature-based methods with statistical analysis techniques through a clustering algorithm, and the clustering process is refined by feedback from the signature-based method. Because semi-supervised clustering depends on obtaining the most informative data to improve clustering performance, the proposed clustering method introduces a new active learning approach for selecting informative constraints. The hybrid method covers four tunneling protocols (L2TP, PPTP, IPsec, and OpenVPN). The results indicate that the proposed hybrid method significantly increases accuracy and cluster purity, and that these protocols are identified with high accuracy and low processing cost.
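
The general idea of refining a clustering with signature feedback can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not specify the signatures, the features, or the constraint-selection rule. The sketch assumes a hypothetical signature oracle that only checks well-known default ports, two made-up statistical features per flow, a seeded k-means in which labeled flows stay pinned to their cluster, and a simple smallest-margin rule for choosing which flow to send to the oracle next.

```python
import math

# Hypothetical signature table: default ports only (an assumption for this
# sketch, not the signatures used in the paper).
PORT_SIGNATURES = {
    ("udp", 1701): "L2TP",
    ("tcp", 1723): "PPTP",
    ("udp", 500): "IPsec",
    ("udp", 1194): "OpenVPN",
}

def signature_label(flow):
    """Return a protocol name if the flow matches a signature, else None."""
    return PORT_SIGNATURES.get((flow["proto"], flow["dport"]))

def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(points):
    return [sum(col) / len(points) for col in zip(*points)]

def assign(flows, centroids, known):
    """Nearest-centroid assignment; flows labeled by the oracle stay pinned."""
    return [
        known[i] if i in known
        else min(centroids, key=lambda c: distance(f["features"], centroids[c]))
        for i, f in enumerate(flows)
    ]

def kmeans_pinned(flows, known, iterations=10):
    """k-means seeded from labeled flows, with one cluster per seen label."""
    labels = sorted(set(known.values()))
    centroids = {
        lab: mean([flows[i]["features"] for i in known if known[i] == lab])
        for lab in labels
    }
    assignment = assign(flows, centroids, known)
    for _ in range(iterations):
        for lab in labels:
            # Never empty: every label has at least one pinned flow.
            members = [f["features"] for f, a in zip(flows, assignment) if a == lab]
            centroids[lab] = mean(members)
        updated = assign(flows, centroids, known)
        if updated == assignment:
            break
        assignment = updated
    return assignment, centroids

def most_uncertain(flows, centroids, skip):
    """Index of the unqueried flow with the smallest gap between its two
    nearest centroids (assumed selection rule), or None if none is left."""
    if len(centroids) < 2:
        return None
    best, best_margin = None, float("inf")
    for i, f in enumerate(flows):
        if i in skip:
            continue
        d = sorted(distance(f["features"], c) for c in centroids.values())
        if d[1] - d[0] < best_margin:
            best, best_margin = i, d[1] - d[0]
    return best

def hybrid_cluster(flows, seeds, budget=2):
    """Cluster flows, spending `budget` signature queries on uncertain flows."""
    known = dict(seeds)
    asked = set(known)
    assignment, centroids = kmeans_pinned(flows, known)
    for _ in range(budget):
        idx = most_uncertain(flows, centroids, asked)
        if idx is None:
            break
        asked.add(idx)
        label = signature_label(flows[idx])
        if label in centroids:  # feedback from the signature side
            known[idx] = label
            assignment, centroids = kmeans_pinned(flows, known)
    return assignment

# Toy data: the two feature values per flow are invented for illustration.
flows = [
    {"proto": "udp", "dport": 1701, "features": [0.0, 0.0]},
    {"proto": "udp", "dport": 1194, "features": [10.0, 10.0]},
    {"proto": "udp", "dport": 1701, "features": [1.0, 1.0]},
    {"proto": "udp", "dport": 1194, "features": [9.0, 9.0]},
    {"proto": "udp", "dport": 1194, "features": [4.5, 4.5]},
]
seeds = {i: signature_label(flows[i]) for i in (0, 1)}
print(hybrid_cluster(flows, seeds, budget=0))
print(hybrid_cluster(flows, seeds, budget=2))
```

In this toy run the last flow sits between the two clusters. Without any signature queries it is grouped with L2TP on distance alone. With a query budget it is the first flow selected, the oracle reports OpenVPN, and the clustering is rerun with that flow pinned. A real system would replace the port lookup with proper packet content inspection and the margin rule with whatever constraint-selection criterion is actually being evaluated.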
