Enhanced Disk-Based Databases Towards Improved Hybrid In-Memory Systems

In-memory database systems are becoming popular because modern high-end servers now offer sufficiently large and affordable RAM, together with processors capable of managing large in-memory database transactions. While fast and reliable in-memory systems are still being developed to overcome cache misses, CPU/IO bottlenecks and distributed transaction costs, disk-based data stores still serve as the primary persistence layer. In addition, with the recent growth in multi-tenancy cloud applications and the associated security concerns, many organisations weigh the trade-offs and continue to require the fast and reliable transaction processing of disk-based database systems as an available choice. For these organisations, the only way of increasing throughput is by improving the performance of disk-based concurrency control. This warrants a hybrid database system that can selectively apply enhanced disk-based data management within the context of in-memory systems to help improve overall throughput. The general view is that in-memory systems substantially outperform disk-based systems. We question this assumption and examine how a modified variation of access invariance, which we call enhanced memory access (EMA), can be used to allow very high levels of concurrency in the pre-fetching of data in disk-based systems. We demonstrate how this pre-fetching in disk-based systems can yield close to in-memory performance, which paves the way for improved hybrid database systems. This paper proposes a novel EMA technique and presents a comparative study between disk-based EMA systems and in-memory systems running on hardware configurations of equivalent power in terms of the number of processors and their speeds. The results of the experiments conducted substantiate that, when used in conjunction with all concurrency control mechanisms, EMA can increase the throughput of disk-based systems to levels quite close to those achieved by in-memory systems. The promising results of this work show that enhanced disk-based systems help improve hybrid data management within the broader context of in-memory systems.

Keywords: Concurrency control, disk-based databases, in-memory systems, enhanced memory access (EMA).
