Paper Information - Data Morphing: An Adaptive, Cache-Conscious Storage Technique

Data Morphing: An Adaptive, Cache-Conscious Storage Technique

The number of processor cache misses has a critical impact on the performance of DBMSs running on servers with large main-memory configurations. In turn, the cache utilization of database systems is highly dependent on the physical organization of the records in main memory. A recently proposed storage model, called PAX, was shown to greatly improve the performance of sequential file-scan operations when compared to the commonly implemented N-ary storage model. However, the PAX storage model can also demonstrate poor cache utilization for other common operations, such as index scans. Under a workload of heterogeneous database operations, neither the PAX storage model nor the N-ary storage model is optimal. In this paper, we propose a flexible data storage technique called Data Morphing. Using Data Morphing, a cache-efficient attribute layout, called a partition, is first determined through an analysis of the query workload. This partition is then used as a template for storing data in a cache-efficient way. We present two algorithms for computing partitions, and also present a versatile storage model that accommodates the dynamic reorganization of the attributes in a file. Finally, we experimentally demonstrate that the Data Morphing technique provides a significant performance improvement over both the traditional N-ary storage model and the PAX model.
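The trade-off described in the abstract can be illustrated with a toy cache-line count. This is only a rough sketch, not the cost model or the partitioning algorithms from the paper: the 64-byte line size, the 16-byte attributes, the perfect alignment, and the function names `lines_per_record` and `lines_per_scan` are all assumptions made for illustration. A layout is written as a partition of the attributes into groups, where the attributes of one group are stored contiguously for each record and each group is stored contiguously across records.

```python
from math import ceil

LINE = 64  # assumed cache-line size in bytes (illustrative only)

def lines_per_record(partition, sizes, accessed):
    """Cache lines touched when fetching `accessed` attributes of ONE record
    (a stand-in for an index lookup), assuming perfectly aligned groups."""
    total = 0
    for group in partition:
        if any(attr in accessed for attr in group):
            total += ceil(sum(sizes[attr] for attr in group) / LINE)
    return total

def lines_per_scan(partition, sizes, accessed, n_records):
    """Cache lines touched when scanning `accessed` attributes of ALL records,
    assuming each group is stored contiguously across records."""
    total = 0
    for group in partition:
        if any(attr in accessed for attr in group):
            total += ceil(n_records * sum(sizes[attr] for attr in group) / LINE)
    return total

# Hypothetical relation: four 16-byte attributes; queries read only a and b.
sizes = {"a": 16, "b": 16, "c": 16, "d": 16}
accessed = {"a", "b"}

nary = [["a", "b", "c", "d"]]        # N-ary style: whole record in one group
pax = [["a"], ["b"], ["c"], ["d"]]   # PAX style: one group per attribute
grouped = [["a", "b"], ["c", "d"]]   # a workload-driven partition

for name, layout in [("N-ary", nary), ("PAX", pax), ("grouped", grouped)]:
    print(name,
          lines_per_record(layout, sizes, accessed),
          lines_per_scan(layout, sizes, accessed, 1000))
```

Under these assumptions the N-ary layout touches 1 line per record lookup but 1000 lines for the scan, the PAX layout touches 2 lines per lookup but 500 for the scan, and the grouped partition touches 1 and 500 respectively, which is the kind of middle ground between the two models that the abstract argues for.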

Jignesh M. Patel | Richard A. Hankins

[1] Wan Choi, et al. Design and implementation of the concurrency control manager in the main-memory DBMS Tachyon, 2002, Proceedings 26th Annual International Computer Software and Applications.

[2] S. Bing Yao. An attribute based model for database access cost analysis, 1977, TODS.

[3] Alfonso F. Cardenas. Analysis and performance of inverted data base structures, 1975, CACM.

[4] S. B. Yao, et al. Approximating block accesses in database organizations, 1977, CACM.

[5] G. Rota. The Number of Partitions of a Set, 1964.

[6] David J. DeWitt, et al. Weaving Relations for Cache Performance, 2001, VLDB.

[7] Peter Boncz, et al. Monet: An Impressionist Sketch of an Advanced Database System, 1994.

[8] David J. DeWitt, et al. The Wisconsin Benchmark: Past, Present, and Future, 1991, The Benchmark Handbook.

[9] Guido Moerkotte, et al. Efficient Storage of XML Data, 2000, Proceedings of 16th International Conference on Data Engineering.

[10] Chandra Krintz, et al. Cache-conscious data placement, 1998, ASPLOS VIII.

[11] A. Odlyzko. Asymptotic enumeration methods, 1996.

[12] Jeffrey F. Naughton, et al. Cache Conscious Algorithms for Relational Query Processing, 1994, VLDB.

[13] Michael Stonebraker, et al. The Asilomar report on database research, 1998, SIGMOD Rec.

[14] David J. DeWitt, et al. DBMSs on a Modern Processor: Where Does Time Go?, 1999, VLDB.

[15] Shamkant B. Navathe, et al. Vertical partitioning algorithms for database design, 1984, TODS.

[16] Jack J. Dongarra, et al. A Portable Programming Interface for Performance Evaluation on Modern Processors, 2000, Int. J. High Perform. Comput. Appl.

[17] Martin L. Kersten, et al. Database Architecture Optimized for the New Bottleneck: Memory Access, 1999, VLDB.

[18] Raghu Ramakrishnan, et al. Database Management Systems, 1976.

[19] Setrag Khoshafian, et al. A decomposition storage model, 1985, SIGMOD Conference.

[20] James R. Larus, et al. Cache-conscious structure layout, 1999, PLDI '99.