Transport layer identification of P2P traffic

Since the emergence of peer-to-peer (P2P) networking in the late '90s, P2P applications have multiplied, evolved and established themselves as the leading `growth app' of Internet traffic workload. In contrast to first-generation P2P networks which used well-defined port numbers, current P2P applications have the ability to disguise their existence through the use of arbitrary ports. As a result, reliable estimates of P2P traffic require examination of packet payload, a methodological landmine from legal, privacy, technical, logistic, and fiscal perspectives. Indeed, access to user payload is often rendered impossible by one of these factors, inhibiting trustworthy estimation of P2P traffic growth and dynamics. In this paper, we develop a systematic methodology to identify P2P flows at the transport layer, i.e., based on connection patterns of P2P networks, and without relying on packet payload. We believe our approach is the first method for characterizing P2P traffic using only knowledge of network dynamics rather than any user payload. To evaluate our methodology, we also develop a payload technique for P2P traffic identification, by reverse engineering and analyzing the nine most popular P2P protocols, and demonstrate its efficacy with the discovery of P2P protocols in our traces that were previously unknown to us. Finally, our results indicate that P2P traffic continues to grow unabatedly, contrary to reports in the popular media.

[1]  George C. Polyzos,et al.  A Parameterizable Methodology for Internet Traffic Flow Profiling , 1995, IEEE J. Sel. Areas Commun..

[2]  Stefan Saroiu,et al.  A Measurement Study of Peer-to-Peer File Sharing Systems , 2001 .

[3]  David Moore,et al.  The CoralReef Software Suite as a Tool for System and Network Administrators , 2001, LISA.

[4]  kc claffy,et al.  The architecture of CoralReef: an Internet traffic monitoring software suite , 2001 .

[5]  Nathaniel Leibowitz,et al.  ARE FILE SWAPPING NETWORKS CACHEABLE? CHARACTERIZING P2P TRAFFIC , 2002 .

[6]  Chase Cotton,et al.  Packet-level traffic measurements from the Sprint IP backbone , 2003, IEEE Netw..

[7]  Stefan Savage,et al.  Understanding Availability , 2003, IPTPS.

[8]  W. Norton The Evolution of the U.S. Internet Peering Ecosystem , 2003 .

[9]  Michalis Faloutsos,et al.  File-sharing in the Internet: A characterization of P2P traffic in the backbone , 2003 .

[10]  Ellen W. Zegura,et al.  Bootstrapping in Gnutella: A Measurement Study , 2004, PAM.

[11]  Kimberly C. Claffy,et al.  Their Share: Diversity and Disparity in IP Traffic , 2004, PAM.

[12]  Jia Wang,et al.  Analyzing peer-to-peer traffic across large networks , 2002, IMW '02.

[13]  Mostafa H. Ammar,et al.  Prefix-preserving IP address anonymization: measurement-based security evaluation and a new cryptography-based scheme , 2004, Comput. Networks.

[14]  Mikel Izal,et al.  Dissecting BitTorrent: Five Months in a Torrent's Lifetime , 2004, PAM.

[15]  Kurt Tutschku,et al.  A Measurement-Based Traffic Profile of the eDonkey Filesharing Service , 2004, PAM.

[16]  Michalis Faloutsos,et al.  Is P2P dying or just hiding? [P2P traffic measurement] , 2004, IEEE Global Telecommunications Conference, 2004. GLOBECOM '04..

[17]  Christos Gkantsidis,et al.  Random walks in peer-to-peer networks , 2004, IEEE INFOCOM 2004.

[18]  Oliver Spatscheck,et al.  Accurate, scalable in-network identification of p2p traffic using application signatures , 2004, WWW '04.