Modeling and Caching of Peer-to-Peer Traffic

Peer-to-peer (P2P) file-sharing systems generate a major portion of Internet traffic, and this portion is expected to increase in the future. We explore the potential of deploying proxy caches in different autonomous systems (ASes) with the goal of reducing the cost incurred by Internet service providers and alleviating the load on the Internet backbone. We conduct a measurement study to model the popularity of P2P objects in different ASes. Our study shows that the popularity of P2P objects can be modeled by a Mandelbrot-Zipf distribution, regardless of the AS. Guided by our findings, we develop a novel caching algorithm for P2P traffic that is based on object segmentation and on partial admission and eviction of objects. Our trace-based simulations show that with a relatively small cache, sized at less than 10% of the total traffic, our algorithm achieves a byte hit rate of up to 35%, which is close to the byte hit rate achieved by an off-line optimal algorithm with complete knowledge of future requests. Our results also show that the byte hit rate of our algorithm is at least 40% higher than, and at most triple, that of the common Web caching algorithms. Furthermore, our algorithm is robust in the face of aborted downloads, which are common in P2P systems.
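
To make the popularity model concrete, the following is a minimal sketch of the Mandelbrot-Zipf distribution in its commonly used form, p(i) = K / (i + q)^alpha, where i is the popularity rank of an object, alpha is the skewness factor, q >= 0 is the plateau factor, and K normalizes the probabilities. The function name and the parameter values below are illustrative assumptions and are not taken from the measurement study.

```python
def mandelbrot_zipf_pmf(n, alpha, q):
    """Return [p(1), ..., p(n)] with p(i) = K / (i + q)**alpha.

    n     -- number of distinct objects (ranks 1..n)
    alpha -- skewness factor
    q     -- plateau factor; q = 0 gives a Zipf-like distribution
    K is chosen so that the probabilities sum to 1.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if q < 0:
        raise ValueError("q must be non-negative")
    weights = [1.0 / (i + q) ** alpha for i in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical parameters, chosen only to illustrate the shape.
mzipf = mandelbrot_zipf_pmf(1000, alpha=0.6, q=20)
zipf = mandelbrot_zipf_pmf(1000, alpha=0.6, q=0)

# A larger q flattens the head: the most popular objects receive a
# smaller share of requests than under a Zipf-like distribution.
print("top-10 request share, q=20:", sum(mzipf[:10]))
print("top-10 request share, q=0: ", sum(zipf[:10]))
```

With q = 0 the expression reduces to a Zipf-like distribution, so comparing the two lists shows how the plateau factor lowers the request share of the most popular objects, which is the property that matters when sizing a cache.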
