Effective Load-Balancing via Migration and Replication in Spatial Grids

The unprecedented growth of spatial data available at geographically distributed locations, coupled with the emergence of grid computing, provides strong motivation for designing a spatial grid that supports fast data retrieval and lets its users transparently access data at any location from anywhere. This calls for efficient search and load-balancing mechanisms. This paper focuses on dynamic load-balancing in spatial grids via data migration/replication, to prevent the degradation in system performance caused by severe load imbalance among the nodes. The main contributions of our proposal are as follows. First, we view a spatial grid as comprising several clusters, where each cluster is a local area network (LAN), and propose a novel inter-cluster load-balancing algorithm that uses migration/replication of data. Second, we present a novel scalable technique for dynamic data placement that not only improves data availability but also minimizes disruptions and downtime to the system. Our performance study demonstrates the effectiveness of the proposed approach in correcting workload skews, thereby improving system performance. To our knowledge, this work is one of the earliest attempts at addressing load-balancing via both online data migration and replication in grid environments.
