Adaptive Load-Balancing for Consistent Hashing in Heterogeneous Clusters

In a distributed system, data must be partitioned across the constituent systems. Consistent hashing is often used to compute this mapping from a hash function applied to the data, so it alone determines how the workload is distributed across the systems. The default consistent hashing partitioning scheme accounts for neither the load distribution nor the heterogeneity of the cluster (in terms of disk speed, power, network, load, etc.), and hence creates performance bottlenecks. This paper proposes an adaptive partitioning scheme for consistent hashing that factors in the heterogeneity of the systems and dynamically adapts to the changing workload, resulting in better system performance. A prototype was built to evaluate the scheme; in the benchmark it was 20% faster than regular consistent hashing and also distributed load more evenly across the systems.
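To make the idea concrete, the following is a minimal sketch of one common way to bias a consistent hash ring toward more capable systems: each system receives a number of virtual nodes proportional to a weight, and the weight can be changed at runtime. This is an illustration under stated assumptions, not the scheme proposed in this paper. The class `WeightedRing`, its methods, the `base_vnodes` parameter, and the node names are all hypothetical, and how a weight would be derived from disk speed, network, or observed load is left open.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a 64-bit position on the ring (SHA-256, truncated)."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


class WeightedRing:
    """Consistent hash ring with weight-proportional virtual nodes.

    Hypothetical illustration: a node with weight 2.0 gets twice as many
    ring positions as a node with weight 1.0, so it owns roughly twice
    the key space.
    """

    def __init__(self, base_vnodes: int = 100):
        self.base_vnodes = base_vnodes  # virtual nodes per unit of weight
        self.weights = {}               # node name -> current weight
        self._owners = {}               # ring position -> node name
        self._points = []               # sorted ring positions

    def set_weight(self, node: str, weight: float) -> None:
        """Add a node or change its weight, then rebuild the ring."""
        self.weights[node] = weight
        self._rebuild()

    def remove(self, node: str) -> None:
        """Remove a node from the ring if it is present."""
        self.weights.pop(node, None)
        self._rebuild()

    def _rebuild(self) -> None:
        self._owners = {}
        for node, weight in self.weights.items():
            count = max(1, round(self.base_vnodes * weight))
            # Virtual node i of a given node always hashes to the same
            # position, so shrinking a weight only drops the highest-numbered
            # positions and leaves the rest of the ring untouched.
            for i in range(count):
                self._owners[_hash(f"{node}#{i}")] = node
        self._points = sorted(self._owners)

    def lookup(self, key: str) -> str:
        """Return the node owning the first ring position after the key."""
        if not self._points:
            raise LookupError("ring is empty")
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owners[self._points[idx]]


ring = WeightedRing()
ring.set_weight("fast-node", 3.0)  # assumed to be the more capable system
ring.set_weight("slow-node", 1.0)
owner = ring.lookup("user:42")     # one of the two node names
```

In a sketch like this, an adaptive controller would periodically call `set_weight` with values derived from measured capacity or load. Lowering one node's weight moves only the keys that node gives up; keys already owned by other nodes stay where they are.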