Query Centric Partitioning and Allocation for Partially Replicated Database Systems

A key feature of database systems is transparent access to stored data. In distributed database systems, this transparency extends to data allocation and fragmentation. Transparent access, however, introduces data dependencies and increases system complexity and inter-process communication. Many developers therefore trade transparency for better scalability by using sharding and similar techniques. Explicitly managing data distribution and data flow, in turn, requires a deep understanding of both the distributed system and the data access patterns, and it limits the opportunities for optimization. To address this problem, we present an approach to data allocation that scales well while keeping the data distribution transparent. We propose a workload-aware, query-centric, heterogeneity-aware analytical model, formalize the approach, and present an efficient allocation algorithm. The algorithm optimizes the partitioning and data layout for local query execution and balances the workload on homogeneous and heterogeneous systems according to the query history. In the evaluation, we demonstrate that our approach scales well in performance for OLTP- and OLAP-style workloads and significantly reduces storage requirements compared with fully replicated systems while guaranteeing configurable availability.
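To make the idea of query-centric allocation concrete, the following is a minimal sketch of a greedy heuristic. It is not the algorithm the paper formalizes. It assumes the query history has already been grouped into classes, each with a workload share and the set of fragments it reads. The names `Backend`, `allocate`, `capacity`, and the example tables are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Backend:
    name: str
    capacity: float                      # relative processing power (assumed unit)
    load: float = 0.0                    # workload share assigned so far
    fragments: set = field(default_factory=set)


def allocate(query_classes, backends):
    """Greedily place query classes so each one can run locally on one backend.

    query_classes: list of (weight, fragments) pairs, where weight is the
    class's share of the query history and fragments is the set of
    fragments (tables, columns, or ranges) that the class reads.
    """
    # Place heavy classes first; they are the hardest to balance later.
    for weight, frags in sorted(query_classes, key=lambda qc: -qc[0]):
        def cost(b):
            # Load relative to capacity after placement; ties are broken
            # by how many new fragments would have to be copied.
            return ((b.load + weight) / b.capacity, len(frags - b.fragments))

        target = min(backends, key=cost)
        target.load += weight
        target.fragments |= frags        # replicate only what the class needs
    return backends


# Hypothetical query history: (share of workload, fragments accessed).
history = [
    (0.5, {"orders", "lineitem"}),
    (0.3, {"customer"}),
    (0.2, {"orders", "customer"}),
]
nodes = allocate(history, [Backend("fast", 2.0), Backend("slow", 1.0)])
for node in nodes:
    print(node.name, round(node.load, 2), sorted(node.fragments))
```

Dividing the load by `capacity` is one simple way to account for heterogeneous backends: a node with twice the capacity accepts roughly twice the workload share. The sketch ignores updates and the configurable availability guarantee mentioned in the abstract; handling either would require placing additional replicas beyond what the read workload alone demands.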
