Flash crowd in a file sharing system based on random encounters

BitTorrent revolutionized the distribution of a very large file to a very large number of recipients. The file is chopped into small chunks, which recipients can start uploading to others as soon as they have received them. In the original design, a "tracker" retains a degree of centralized control over the chunk transfer process. This paper studies a BitTorrent-like "information diffusion" system that has a fully distributed and symmetric architecture. The peers join a Distributed Hash Table-based overlay network and contact each other randomly. Designs of this kind have recently been implemented and analysed. A trackerless BitTorrent system has been introduced that can be regarded as based on random encounters: the participating nodes contact each other at random and download missing chunks. On the analytical front, Massoulie and Vojnovic showed that a system based on random encounters performs surprisingly well without any chunk preference strategy, provided that each peer gets its first chunk from a sufficiently uniform distribution. In this paper, we focus on a scenario where this condition cannot be guaranteed, and show that a "rare chunk phenomenon" easily occurs if both the encounters and the chunk selection are random. Classic urn models give some mathematical understanding of this phenomenon. We then discuss various techniques for alleviating the rare chunk problem and propose a simple distributed chunk selection policy that reduces the imbalance in the distribution of chunks within the network.
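
The random-encounter mechanism with random chunk selection can be illustrated with a minimal simulation sketch. This is a toy model under assumptions that are not taken from the paper: a single seed holding every chunk, a fixed population of initially empty peers that never leave, and one chunk transferred per encounter. The names `simulate` and `imbalance` and all parameter values are hypothetical.

```python
import random


def simulate(num_peers=200, num_chunks=20, encounters=5000, seed=0):
    """Toy model (assumed, not the paper's): return copies held per chunk."""
    rng = random.Random(seed)
    # Peer 0 is a seed holding every chunk; all other peers start empty.
    holdings = [set(range(num_chunks))] + [set() for _ in range(num_peers)]
    for _ in range(encounters):
        # A uniformly random pair meets; the first peer is the downloader.
        downloader, uploader = rng.sample(range(len(holdings)), 2)
        missing = holdings[uploader] - holdings[downloader]
        if missing:
            # Random chunk selection: no preference for rare chunks.
            holdings[downloader].add(rng.choice(sorted(missing)))
    return [sum(c in h for h in holdings) for c in range(num_chunks)]


def imbalance(copies):
    """Ratio of the most replicated chunk to the least replicated one."""
    return max(copies) / min(copies)


if __name__ == "__main__":
    copies = simulate()
    print("copies per chunk:", copies)
    print("max/min imbalance:", imbalance(copies))
```

Running the script with different values of `encounters` and `seed` gives a rough view of how unevenly the chunks spread during the early phase of a flash crowd in this toy setting. A ratio close to 1 indicates a balanced distribution, and a large ratio indicates that some chunk has stayed rare.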