Real-time identification of three Tor pluggable transports using machine learning techniques

Tor is a widely used network for anonymous communication over the Internet. Network operators try to identify and block Tor flows, while Tor developers strengthen flow anonymity with various plugins. Tor and its plugins can be detected by deep packet inspection (DPI) methods. However, DPI-based solutions are computationally intensive, require considerable human effort, and are usually hard to maintain and update. These issues limit the application of DPI methods in practical scenarios. As an alternative, we propose machine learning-based techniques that automatically learn from examples and adapt to new data whenever required. We report an empirical study on the detection of three widely used Tor pluggable transports, namely Obfs3, Obfs4, and ScrambleSuit, using four learning algorithms. We investigate the performance of AdaBoost and Random Forests as two ensemble methods. In addition, we study the effectiveness of SVM and C4.5 as well-known parametric and nonparametric classifiers, respectively. These algorithms use general statistics of the first few packets of the inspected flows. Experiments conducted on real traffic show that all the adopted algorithms can perfectly detect the target traffic by inspecting only the first 10–50 packets. The trained classifiers can readily be deployed in modern network switches and intelligent traffic monitoring systems.
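To make the idea of "general statistics of the first few packets" concrete, the following is a minimal sketch of a per-flow feature extractor. The abstract does not list the features used in the study, so the packet representation (`Packet`), the function name `flow_features`, and the specific statistics (packet size and inter-arrival time summaries, forward-direction ratio) are illustrative assumptions, not the authors' actual feature set.

```python
from statistics import mean, pstdev
from typing import Dict, List, Tuple

# Hypothetical packet record: (timestamp in seconds, size in bytes, direction),
# where direction is +1 for client-to-server and -1 for server-to-client.
Packet = Tuple[float, int, int]


def flow_features(packets: List[Packet], n_first: int = 10) -> Dict[str, float]:
    """Summarize the first n_first packets of a flow as a feature dictionary.

    The chosen statistics are assumptions for illustration only.
    """
    head = packets[:n_first]
    if not head:
        raise ValueError("flow has no packets")

    sizes = [size for _, size, _ in head]
    times = [ts for ts, _, _ in head]
    # Inter-arrival times between consecutive packets.
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    forward = sum(1 for _, _, direction in head if direction > 0)

    return {
        "pkt_count": float(len(head)),
        "size_mean": float(mean(sizes)),
        "size_std": float(pstdev(sizes)),
        "size_min": float(min(sizes)),
        "size_max": float(max(sizes)),
        "iat_mean": float(mean(gaps)) if gaps else 0.0,
        "iat_std": float(pstdev(gaps)) if gaps else 0.0,
        "fwd_ratio": forward / len(head),
    }
```

A feature vector of this kind, computed for each labeled flow, could then be passed to any of the four classifiers named above. Limiting the input to the first `n_first` packets is what allows a decision early in the life of the flow.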
