Identifying Tor Anonymous Traffic Based on Gravitational Clustering Analysis

Anonymous communication technology has brought new challenges to network monitoring. Effectively identifying anonymous traffic plays a key role in preventing the abuse of such technology. In this paper, we propose a gravitational clustering algorithm (GCA) to identify Tor anonymous traffic. Each vector in the dataset is treated as an object in the feature space, and the objects are moved according to gravitational force and Newton's second law of motion. Unlike traditional methods, our method automatically determines the number of clusters. Clustering also helps discover anonymous traffic among various unknown traffic types. We also make an empirical comparison with current state-of-the-art clustering algorithms. Experimental results show that our method performs better than the other methods under the same experimental settings.
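The abstract does not give the exact update rule, so the following is only a minimal sketch of the general idea of gravitational clustering, not the GCA implementation itself. It assumes unit-mass objects, a displacement of 0.5·a·dt² with dt = 1 (from the second law of motion, starting each step at rest), a gravitational constant that decays every iteration, and a merge of any two objects closer than a threshold. The names `gravitational_clustering`, `G`, `decay`, `eps`, `max_step`, and `max_iter`, the step cap, and the center-of-mass merge are all illustrative assumptions.

```python
import math


def gravitational_clustering(points, G=1.0, decay=0.02, eps=0.2,
                             max_step=0.05, max_iter=200):
    """Toy gravitational clustering; returns a list of member-index lists."""
    # Each object keeps a position, a mass, and the input indices it absorbed.
    objs = [{"pos": list(p), "mass": 1.0, "members": [i]}
            for i, p in enumerate(points)]

    def merge_close():
        # Repeatedly fuse any two objects closer than eps (assumed rule):
        # the fused object sits at the center of mass and carries both masses.
        merged = True
        while merged:
            merged = False
            for i in range(len(objs)):
                for j in range(i + 1, len(objs)):
                    a, b = objs[i], objs[j]
                    if math.dist(a["pos"], b["pos"]) < eps:
                        m = a["mass"] + b["mass"]
                        a["pos"] = [(a["mass"] * x + b["mass"] * y) / m
                                    for x, y in zip(a["pos"], b["pos"])]
                        a["mass"] = m
                        a["members"] += b["members"]
                        del objs[j]
                        merged = True
                        break
                if merged:
                    break

    for _ in range(max_iter):
        merge_close()  # afterwards every pair is at least eps apart
        if len(objs) == 1:
            break
        steps = []
        for a in objs:
            acc = [0.0] * len(a["pos"])
            for b in objs:
                if a is b:
                    continue
                diff = [y - x for x, y in zip(a["pos"], b["pos"])]
                d = math.dist(a["pos"], b["pos"])
                # a = F / m_a = G * m_b / d^2, directed along diff / d
                scale = G * b["mass"] / d ** 3
                acc = [c + scale * v for c, v in zip(acc, diff)]
            step = [0.5 * c for c in acc]  # s = 0.5 * a * dt^2 with dt = 1
            norm = math.hypot(*step)
            if norm > max_step:  # cap the move so objects cannot jump past each other
                step = [v * max_step / norm for v in step]
            steps.append(step)
        for a, s in zip(objs, steps):
            a["pos"] = [x + v for x, v in zip(a["pos"], s)]
        G *= 1.0 - decay  # decaying G keeps distant groups from collapsing together
    merge_close()
    return [sorted(o["members"]) for o in objs]


# Hypothetical 2-D feature vectors forming two well-separated groups.
points = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 10), (11, 10), (10, 11)]
clusters = gravitational_clustering(points)
print(len(clusters), clusters)
```

The number of clusters is never passed in: it is simply the number of objects left once the decaying force can no longer pull groups together. Keeping `max_step` below `eps / 2` is a design choice of this sketch, so that two approaching objects always come within merge range before they could cross.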
