Load Balancing on a Grid Using Data Characteristics

In this paper, we develop an efficient partitioning scheme for a grid environment that improves performance by measuring characteristics of the data. We design a model that simulates a real-life distributed grid environment and test it against a synthetic data set. We use public information about the distribution of U.S. zip codes and last names to build an effective partitioning scheme for a database running on the grid. We demonstrate that smaller data partitions distribute load more evenly across the grid, resulting in quicker response times.
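
The claim that smaller partitions balance load more evenly can be illustrated with a minimal sketch. It is not the model used in this paper: the greedy placement rule, the function names, the four-node grid, and the skewed partition weights are all assumptions chosen for illustration, and each partition is assumed to be splittable into equal pieces.

```python
import heapq


def assign_partitions(weights, num_nodes):
    """Greedy placement: put each partition, heaviest first, on the
    currently least-loaded node. Returns the final load per node."""
    heap = [(0.0, node) for node in range(num_nodes)]
    heapq.heapify(heap)
    loads = [0.0] * num_nodes
    for weight in sorted(weights, reverse=True):
        load, node = heapq.heappop(heap)
        loads[node] = load + weight
        heapq.heappush(heap, (loads[node], node))
    return loads


def imbalance(loads):
    """Busiest node's load divided by the mean load (1.0 is perfectly even)."""
    return max(loads) / (sum(loads) / len(loads))


def split(weights, pieces):
    """Split every partition into `pieces` equal sub-partitions (an assumption)."""
    return [weight / pieces for weight in weights for _ in range(pieces)]


# Hypothetical, skewed record counts per key range (not data from the paper).
coarse = [50, 30, 10, 5, 5]
fine = split(coarse, 4)

coarse_ratio = imbalance(assign_partitions(coarse, num_nodes=4))
fine_ratio = imbalance(assign_partitions(fine, num_nodes=4))
print(f"coarse partitions: {coarse_ratio:.2f}, fine partitions: {fine_ratio:.2f}")
```

With these made-up weights, the coarse layout leaves the busiest node carrying twice the mean load, because the single largest partition cannot be spread out. The finer layout lets the greedy rule even out all four nodes.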