Profiling and identification of P2P traffic

Accurate identification of network applications is important for many network activities. The traditional port-based technique has become much less effective because many new applications no longer use well-known, fixed port numbers. In this paper, we propose a novel profile-based approach to identifying the traffic flows that belong to a target application. Whereas previous studies classify traffic based on statistics of individual flows, we build behavioral profiles of the target application that describe its dominant patterns. Based on these behavioral profiles, we use a two-level matching method to identify new traffic. We first determine whether a host participates in the target application by comparing its behavior with the profiles. We then compare each flow of that host with the patterns in the application profiles to determine which flows belong to the application. We demonstrate the effectiveness of our method on campus traffic traces. Our results show that popular P2P applications can be identified with very high accuracy.
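
The two-level matching idea can be illustrated with a minimal sketch. The abstract does not specify how profiles are represented or how a host's behavior is compared with them, so everything concrete here is an assumption made for illustration: a profile is a list of patterns, each pattern is a small set of feature values, a host is taken to participate when at least a hypothetical `host_threshold` fraction of its flows match some pattern, and the feature names (`proto`, `size_bucket`, `dst_port_class`) are invented.

```python
from collections import defaultdict

# Hypothetical profile: each pattern is a dict of feature -> required value.
# The features and values are illustrative, not taken from the paper.
PROFILE = [
    {"proto": "udp", "size_bucket": "small"},
    {"proto": "tcp", "dst_port_class": "high"},
]


def matches_pattern(flow, pattern):
    """A flow matches a pattern if it agrees on every feature the pattern names."""
    return all(flow.get(key) == value for key, value in pattern.items())


def matches_profile(flow, profile):
    """A flow matches the profile if it matches at least one pattern."""
    return any(matches_pattern(flow, pattern) for pattern in profile)


def identify(flows, profile, host_threshold=0.5):
    """Two-level matching (assumed form).

    Level 1: keep hosts whose fraction of profile-matching flows
             reaches host_threshold (an assumed criterion).
    Level 2: for those hosts only, return the individual matching flows.
    """
    flows_by_host = defaultdict(list)
    for flow in flows:
        flows_by_host[flow["host"]].append(flow)

    identified = []
    for host_flows in flows_by_host.values():
        hits = [f for f in host_flows if matches_profile(f, profile)]
        if len(hits) / len(host_flows) >= host_threshold:  # level 1: host
            identified.extend(hits)                        # level 2: flows
    return identified


# Made-up example flows for two hosts.
FLOWS = [
    {"host": "10.0.0.1", "proto": "udp", "size_bucket": "small", "dst_port_class": "high"},
    {"host": "10.0.0.1", "proto": "tcp", "size_bucket": "large", "dst_port_class": "high"},
    {"host": "10.0.0.1", "proto": "tcp", "size_bucket": "large", "dst_port_class": "well_known"},
    {"host": "10.0.0.2", "proto": "tcp", "size_bucket": "small", "dst_port_class": "high"},
    {"host": "10.0.0.2", "proto": "tcp", "size_bucket": "large", "dst_port_class": "well_known"},
    {"host": "10.0.0.2", "proto": "tcp", "size_bucket": "small", "dst_port_class": "well_known"},
    {"host": "10.0.0.2", "proto": "udp", "size_bucket": "large", "dst_port_class": "well_known"},
]

result = identify(FLOWS, PROFILE)
```

In this sketch, host 10.0.0.2 has one flow that matches a pattern, but the host as a whole falls below the assumed threshold, so that flow is not attributed to the application. This is the intended effect of checking the host before the individual flows.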
