A Comparative Analysis of Materialized Views Selection and Concurrency Control Mechanisms in NoSQL Databases

Relational databases are well suited for vertical scaling; however, specialized hardware can be expensive. Conversely, NewSQL and NoSQL data stores are designed to scale horizontally. NewSQL databases provide ACID transaction support; however, joins are limited to the partition keys, which restricts query expressiveness. NoSQL databases, on the other hand, are designed to scale out on commodity hardware, but they suffer from slow join performance. Hence, we consider whether NoSQL join performance can be improved while ensuring ACID semantics and without drastically sacrificing write performance, disk utilization, and query expressiveness. This paper presents the Synergy system, which leverages schema- and workload-driven mechanisms to identify materialized views, together with a specialized concurrency control system on top of a NoSQL database, to enable scalable data management with familiar relational conventions. Synergy trades slight write performance degradation and increased disk utilization for faster join performance (compared to standard NoSQL databases) and improved query expressiveness (compared to NewSQL databases).
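
The trade-off described above (extra writes and storage in exchange for faster joins) can be illustrated with a minimal sketch. This is not Synergy's implementation: the `TinyStore` class, its tables, and its method names are hypothetical, the store is an in-memory toy, and concurrency control is not modeled at all.

```python
class TinyStore:
    """Toy key-value store with one materialized join view (hypothetical)."""

    def __init__(self):
        self.customers = {}  # customer_id -> name
        self.orders = {}     # order_id -> (customer_id, item)
        self.view = {}       # customer_id -> [(order_id, name, item)]
        self.writes = 0      # number of physical writes performed

    def put_customer(self, customer_id, name):
        self.customers[customer_id] = name
        self.writes += 1

    def put_order(self, order_id, customer_id, item):
        # Base-table write.
        self.orders[order_id] = (customer_id, item)
        self.writes += 1
        # View maintenance: one extra write that stores the joined row.
        name = self.customers[customer_id]
        self.view.setdefault(customer_id, []).append((order_id, name, item))
        self.writes += 1

    def join_scan(self, customer_id):
        # Join computed at read time by scanning every order.
        name = self.customers[customer_id]
        return [(oid, name, item)
                for oid, (cid, item) in self.orders.items()
                if cid == customer_id]

    def join_view(self, customer_id):
        # Same result served from the precomputed view with a single lookup.
        return list(self.view.get(customer_id, []))


store = TinyStore()
store.put_customer(1, "Ada")
store.put_customer(2, "Bob")
store.put_order(10, 1, "disk")
store.put_order(11, 1, "cable")
store.put_order(12, 2, "fan")

scan_result = store.join_scan(1)
view_result = store.join_view(1)
```

In this sketch each order costs two writes instead of one and the joined rows are stored twice, while the read path drops from a full scan to a single key lookup. Propagating a later change to a customer's name into the view is deliberately left out.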
