Main Memory Adaptive Denormalization

Joins have traditionally been the most expensive database operator, but they are required to query normalized schemas. In turn, normalized schemas are necessary to minimize update costs and space usage. Joins can be avoided altogether by using a denormalized schema instead of a normalized one; this improves analytical query processing times at the cost of increased update overhead, loading cost, and storage requirements. In this work, we show that the best of both worlds can be achieved by leveraging partial, incremental, and dynamic denormalized tables to avoid join operators, resulting in fast query performance while retaining the low loading, update, and storage costs of a normalized schema. We introduce adaptive denormalization for modern main memory systems: it replaces traditional join operations with efficient scans over the relevant partial universal tables, without incurring the prohibitive cost of full denormalization.
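The idea of partial, incremental denormalization can be illustrated with a toy sketch. This is not the system described above: the class name `PartialDenormalizedTable`, the `orders`/`customers` tables, and the column names are all hypothetical, and the sketch assumes a single foreign key and queries that address a contiguous range of fact rows. Joined rows are materialized only for the fact rows a query touches, so later queries over the same range become plain scans.

```python
from dataclasses import dataclass, field


@dataclass
class PartialDenormalizedTable:
    """Toy partial universal table (hypothetical, for illustration only)."""
    fact: list   # fact rows as dicts, each carrying the foreign key "cust_id"
    dim: dict    # dimension table: primary key -> dict of attributes
    wide: dict = field(default_factory=dict)  # fact row index -> joined row

    def _materialize(self, lo, hi):
        # Denormalize only the fact rows in [lo, hi) that are not joined yet.
        for i in range(lo, hi):
            if i not in self.wide:
                row = self.fact[i]
                self.wide[i] = {**row, **self.dim[row["cust_id"]]}

    def scan(self, lo, hi, predicate):
        # Fill in missing joined rows, then answer the query with a scan
        # over the wide rows instead of a join.
        self._materialize(lo, hi)
        return [self.wide[i] for i in range(lo, hi) if predicate(self.wide[i])]


customers = {1: {"region": "EU"}, 2: {"region": "US"}}
orders = [
    {"order_id": 10, "cust_id": 1, "amount": 5.0},
    {"order_id": 11, "cust_id": 2, "amount": 7.0},
    {"order_id": 12, "cust_id": 1, "amount": 3.0},
]

table = PartialDenormalizedTable(orders, customers)
is_eu = lambda r: r["region"] == "EU"

first = table.scan(0, 2, is_eu)    # materializes rows 0 and 1 only
rows_after_first = len(table.wide)
second = table.scan(0, 3, is_eu)   # reuses rows 0 and 1, adds row 2
eu_total = sum(r["amount"] for r in second)
```

In this sketch the normalized `orders` and `customers` tables stay the source of truth, so loads touch only them; a real design would also have to invalidate or patch the affected wide rows when a dimension row is updated, which the sketch omits.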
