Analysis of peer-to-peer systems: workload characterization and effects on traffic cacheability

Peer-to-peer file-sharing networks have emerged as a popular new class of Internet applications. We provide an analytical model of resource size and of the content shared at a given node. We also study how the composition of the content workload hosted in the Gnutella network changes over time. Finally, we investigate how oversimplified hypotheses (e.g., the use of filenames as resource identifiers) negatively affect the potentially achievable hit rate of a file-sharing cache. Our findings indicate that a cache can reduce file-sharing traffic, lowering both download time and network usage. The design and tuning of the cache server should account for distinct resources that share the same name and should consider push-based downloads; failing to do so can reduce the effectiveness of the caching mechanism.
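
To illustrate the filename-as-identifier issue, the following minimal Python sketch replays a small request trace against an unbounded cache keyed either by filename or by a content digest. The trace, the function names, and the choice of SHA-1 as content identifier are assumptions made for this example only; they are not taken from the measurements or the cache design discussed here.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Identify a resource by a digest of its bytes (SHA-1 is an assumption)."""
    return hashlib.sha1(data).hexdigest()


def replay(trace, key):
    """Replay (filename, data) requests against an unbounded cache.

    Returns (hits, wrong_hits), where a wrong hit is a cache hit whose
    stored bytes differ from the bytes actually requested.
    """
    cache = {}
    hits = wrong_hits = 0
    for name, data in trace:
        k = key(name, data)
        if k in cache:
            hits += 1
            if cache[k] != data:
                wrong_hits += 1
        else:
            cache[k] = data
    return hits, wrong_hits


# Hypothetical trace: two different resources share the name "song.mp3",
# and one of them is also requested under a second name.
TRACE = [
    ("song.mp3", b"version A"),
    ("song.mp3", b"version B"),
    ("song.mp3", b"version A"),
    ("tune.mp3", b"version B"),
]

by_name = replay(TRACE, lambda name, data: name)
by_content = replay(TRACE, lambda name, data: content_id(data))

print(by_name)     # (2, 1): one of the two hits would serve the wrong file
print(by_content)  # (2, 0): both hits return the requested bytes
```

In this toy trace, keying by filename counts a hit that would deliver the wrong resource and misses a resource that is already cached under another name, which is the kind of distortion that a name-based estimate of the achievable hit rate can introduce.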