Efficiency of Data Distribution in BitTorrent-Like Systems

BitTorrent (BT) is, in practice, a very efficient method for sharing data over a network of clients. In this paper we extend the recent work of Arthur and Panigrahy [1] on modelling the distribution of individual data blocks in BT systems, aiming at a better understanding of why BT can achieve a high degree of parallelism. In particular, we include in our study several new network features that BT systems use, as well as different local heuristics for routing data blocks at each client. We conduct simulations to determine to what extent the new network features and routing heuristics affect the distribution efficiency. Our findings confirm that, for the primitive network setting studied in [1], Ω(b log n) phases are indeed required for n clients to download b data blocks. More interestingly, our work suggests that, for the more realistic network setting, the heuristics Random and Rarest Block First both allow n clients to download b blocks in b + O(log n) phases. We believe that the latter bound better reflects the high degree of parallelism of BT observed in reality. It is also worth mentioning that b + log n is the smallest possible number of phases needed; it is interesting to see that some simple local routing heuristics perform so close to the optimum.
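To give a feel for the kind of phase-based experiment described above, the following is a minimal sketch of a Random-style heuristic in a toy model. The model is an assumption made for illustration only and is not the network setting of the paper or of [1]: one server initially holds all b blocks, each node holding at least one block may upload one block per phase, each client may receive at most one block per phase, and an uploader picks a random client and a random block that client still lacks. The function name `simulate_random` and its parameters are hypothetical.

```python
import random


def simulate_random(n_clients, n_blocks, seed=0):
    """Count phases until every client holds all blocks (toy model, see assumptions above)."""
    rng = random.Random(seed)
    full = set(range(n_blocks))
    # Node 0 is the server (holds everything); nodes 1..n_clients start empty.
    have = [set(full)] + [set() for _ in range(n_clients)]
    phases = 0
    while any(have[c] != full for c in range(1, n_clients + 1)):
        phases += 1
        # Only blocks held at the start of the phase may be uploaded in it.
        snapshot = [frozenset(s) for s in have]
        receivers = set()  # clients that already received a block this phase
        uploaders = [u for u in range(n_clients + 1) if snapshot[u]]
        rng.shuffle(uploaders)
        for u in uploaders:
            # Clients that are still free this phase and lack a block u can offer.
            candidates = [
                c for c in range(1, n_clients + 1)
                if c != u and c not in receivers and snapshot[u] - have[c]
            ]
            if not candidates:
                continue
            c = rng.choice(candidates)
            block = rng.choice(sorted(snapshot[u] - have[c]))
            have[c].add(block)
            receivers.add(c)
    return phases
```

Calling, for example, `simulate_random(50, 20)` returns the number of phases used in one random run, which can then be compared against b + log n for the same n and b. Because the toy model differs from the settings studied in the paper, such runs illustrate the style of experiment rather than reproduce the reported results.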