apt-p2p: A Peer-to-Peer Distribution System for Software Package Releases and Updates

The Internet has become a cost-effective vehicle for software development and release, particularly in the free software community. Given the free nature of this software, there are often a number of users motivated by altruism to help with its distribution, so as to promote the healthy development of this voluntary society. It is therefore natural to expect that a peer-to-peer distribution system can be implemented, one that scales well with large user bases and readily exploits the network resources made available by volunteers. Unfortunately, this application scenario has many unique characteristics that make a straightforward adoption of existing peer-to-peer file-sharing systems (such as BitTorrent) suboptimal. In particular, a software release often consists of a large number of packages that are difficult to distribute individually, while the archive as a whole is too large to be distributed in its entirety. The packages are also constantly updated by loosely managed developers, and interest in a particular version of a package can be very limited, depending on the computer platforms and operating systems in use. In this paper, we propose a novel peer-to-peer assisted distribution system design that addresses the above challenges. It enhances existing distribution systems by providing compatible yet more efficient downloading and updating services for software packages. Our design leads to apt-p2p, a practical implementation that extends the popular apt distributor. apt-p2p has been used in conjunction with Debian-based distributions of Linux software packages and is also available in the latest release of Ubuntu. We have addressed the key design issues in apt-p2p, including indexing table customization, response time reduction, and multi-value extension.
Together, these ensure that the altruistic users' resources are effectively utilized and thus significantly reduce the currently large bandwidth requirements of hosting the software, as confirmed by real user statistics that we gathered over the Internet.