Adapt: adaptive database schema design for multi-tenant applications

Multi-tenant data management is a major application of software as a Service (SaaS). Many companies outsource their data to a third party which hosts a multi-tenant database system to provide data management service. The system should have high performance, low space and excellent scalability. One big challenge is to devise a high-quality database schema. Independent Tables Shared Instances and Shared Tables Shared Instances are two state-of-the-art methods. However, the former has poor scalability, while the latter achieves good scalability at the expense of poor performance and high space overhead. In this paper, we trade-off between the two methods and propose an adaptive database schema design approach to achieve good scalability and high performance with low space. To this end, we identify the important attributes and use them to generate a base table. For other attributes, we construct supplementary tables. We propose a cost-based model to adaptively generate the tables above. Our method has the following advantages. First, our method achieves high scalability. Second, our method can trade-off performance and space requirement. Third, our method can be easily applied to existing databases (e.g., MySQL) with minor revisions. Fourth, our method can adapt to any schemas and query workloads. Experimental results show our method achieves high performance and good scalability with low space and outperforms state-of-the-art method.