The growing popularity of peer-to-peer (P2P) systems has created a need to manage huge volumes of data efficiently so that user response times remain acceptable. Dynamically changing popularity of data items and skewed user query patterns in P2P systems may cause some peers to become bottlenecks, resulting in severe load imbalance and, consequently, increased user response times. An effective load-balancing mechanism becomes a necessity in such cases, and it can be achieved by efficient online data migration/replication. While much work has been done to harness the huge computing resources of P2P systems for high-performance computing and scientific applications, load-balancing aimed at faster data access for ordinary users has not received adequate attention. Notably, the sheer size of P2P networks and the inherent dynamism of the environment pose significant challenges to load-balancing. The main contributions of our proposal are threefold. First, we view a P2P system as comprising clusters of peers and present techniques for both intra-cluster and inter-cluster load-balancing. Second, we analyze the trade-offs between migration and replication and formulate a strategy by which the system decides at run-time which option to use. Third, we propose an effective strategy for automatically self-evolving clusters of peers. Our performance evaluation demonstrates that our proposed technique for inter-cluster load-balancing significantly improves system performance. To our knowledge, this work is one of the earliest attempts at addressing load-balancing via both online data migration and replication in P2P environments.