NOSQL Design for Analytical Workloads: Variability Matters

The recent rise of Big Data has strongly challenged the role of relational databases as universal storage systems, especially for analytical workloads. As a result, co-relational alternatives, commonly known as NOSQL (Not Only SQL) databases, are extensively used for Big Data. Because the primary focus of NOSQL is performance, NOSQL databases are designed directly at the physical level, and the resulting schema is therefore tailored to the dataset and access patterns of the problem at hand. However, we believe that NOSQL design can also benefit from traditional design approaches. In this paper, we present a method to design databases for analytical workloads. Starting from the conceptual model and adopting the classical 3-phase design used for relational databases, we propose a novel design method that takes into account the new features brought by NOSQL and encompasses both relational and co-relational design.
