Scheduling online repartitioning in OLTP systems

Previous studies on automatic database partitioning mostly focus on optimizing the (re)partitioning scheme for a given database and its query workload, while overlooking the problem of how to deploy the partition scheme onto the database system efficiently. Deployment is often non-trivial and challenging, especially in a distributed OLTP system where repartitioning is expected to take place online without interfering with user transactions. In this paper, we propose SOAP, a system framework for <u>s</u>cheduling <u>o</u>nline d<u>a</u>tabase re<u>p</u>artitioning for OLTP workloads. SOAP aims to minimize the time needed to execute the repartition operations while guaranteeing the correctness and performance of user transactions. It models and groups the repartition operations into repartition transactions, and then mixes them with the normal transactions for holistic scheduling optimization. SOAP uses a cost-based approach to prioritize the repartition transactions, and leverages a feedback model from control theory to determine in which order and at which frequency the repartition transactions should be scheduled for execution. When the system is under heavy workload, selected repartition operations piggyback onto the normal transactions to mitigate the repartitioning overhead. We have built a SOAP prototype on top of PostgreSQL running on Amazon EC2, and conducted a comprehensive experimental study validating SOAP's significant performance advantages.
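
The interplay between cost-based prioritization and feedback-controlled scheduling can be illustrated with a minimal sketch. This is not SOAP's actual algorithm: the proportional-integral (PI) controller, the benefit-to-cost ranking, and every name and constant below (`RepartitionTxn`, `PIController`, `pick_batch`, the gains `kp` and `ki`) are assumptions made for illustration only. The idea shown is that the observed latency of normal transactions is compared against a target, and the gap determines how many repartition transactions are admitted in the next scheduling interval.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RepartitionTxn:
    """Hypothetical repartition transaction with an estimated benefit and cost."""
    name: str
    benefit: float  # assumed estimate of the gain once the data has moved
    cost: float     # assumed estimate of the work needed to move it (> 0)


@dataclass
class PIController:
    """Illustrative PI controller; the gains are arbitrary, not taken from SOAP."""
    target_latency_ms: float
    kp: float = 0.5
    ki: float = 0.1
    max_rate: float = 100.0
    integral: float = 0.0

    def next_rate(self, observed_latency_ms: float) -> float:
        # Positive error means there is headroom for repartitioning work;
        # negative error means normal transactions are already too slow.
        error = self.target_latency_ms - observed_latency_ms
        self.integral += error
        rate = self.kp * error + self.ki * self.integral
        # Clamp the output to the valid range [0, max_rate].
        return min(max(rate, 0.0), self.max_rate)


def pick_batch(pending: List[RepartitionTxn], rate: float) -> List[RepartitionTxn]:
    """Return the top int(rate) pending transactions by benefit per unit cost."""
    ranked = sorted(pending, key=lambda t: t.benefit / t.cost, reverse=True)
    return ranked[: int(rate)]


if __name__ == "__main__":
    controller = PIController(target_latency_ms=100.0)
    pending = [
        RepartitionTxn("a", benefit=10.0, cost=5.0),
        RepartitionTxn("b", benefit=9.0, cost=1.0),
        RepartitionTxn("c", benefit=4.0, cost=4.0),
    ]
    # Latency samples are made up for the demonstration.
    for latency in (50.0, 200.0):
        rate = controller.next_rate(latency)
        batch = pick_batch(pending, rate)
        print(latency, rate, [t.name for t in batch])
```

In this sketch, a rate of zero under high latency corresponds to the heavy-workload case, in which a real system would need another mechanism, such as the piggybacking described above, to keep repartitioning moving.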