A multi-granularity heuristic-combining approach for censorship circumvention activity identification

Identifying censorship circumvention traffic has become an important task for preventing abuse of circumvention tools. However, traditional flow-based methods suffer from high false positive rates and fail to exploit useful hidden features. In this paper, we propose a novel feature extraction method for censorship circumvention activity identification that extracts features at multiple granularities and uses a heuristic-combining approach to make the final decision. Moreover, unlike traditional approaches, which classify individual flows or packets, the proposed method examines traffic at a new granularity. We present an implementation of the proposed method and report results that demonstrate its effectiveness. Compared with traditional flow-based methods, the proposed method has slightly lower overall accuracy, but its average false positive rate is significantly lower. Copyright © 2016 John Wiley & Sons, Ltd.
